





























The Journal of 


Applied Psychology 


Vol. IV MARCH, 1920 No. 1














EDITORS 


G. STANLEY HALL L. R. GEISSLER 


And a Board of Co-operating Editors


WORCESTER, MASS. 
FLORENCE CHANDLER, Publisher 


Entered as second class matter June 9, 191 , at the post office at Worcester, Mass. under 
the Act of March 3, 1879.






















THE 


JOURNAL OF APPLIED PSYCHOLOGY 


Vol. IV MARCH, 1920 No. 1








EMPLOYMENT PSYCHOLOGY IN THE RUBBER 
INDUSTRY 


By Harold E. Burtt, Ohio State University


1. Introduction 

The writer spent several months in the spring of 1919 as 
Consulting Psychologist of a large Canadian rubber company. 
Most of the time was devoted to the development of tests 
for the use of the employment office of a rubber tire factory. 
It was not originally intended to publish an account of this 
work, because it involved no radically new kinds of tests, 
because there were no inter-correlations of theoretical interest 
and because if detailed test procedure was published it might 
be used by the incompetent or might even reach the hands of 
prospective employees. However, after reading Link’s recent 
book on Employment Psychology, it seemed worth while to 
give a general account of the methods used at the tire factory, 
as they illustrate the solution of problems similar to those 
recounted by Link, but with a quite different statistical ap- 
proach. In both instances the general principle was to cor- 
relate test scores with known vocational ability, but in one 
case a wide range of factory and clerical operations was cov- 
ered with rather simple statistical treatment and in the present 
instance fewer operations were studied, but with the use of 
partial correlations. Furthermore the personal experiences 
of the writer in methods of organizing the work, enlisting 
the co-operation of executives, and arranging for the continu- 
ance of the work might be of interest. In the near future 
when this field is quite apt to be exploited by those with too 
little ability and too much enthusiasm, there will be a need 




for a restraining influence in the form of conservative ac- 
counts of personnel work done with strictly scientific pro- 
cedure. It is hoped that the following will be a slight con- 
tribution of this sort. 


2. Preliminary Work 


The preliminary work at the factory fell into two parts,— 
getting personally oriented and getting in the proper rapport 
with those in authority. It was necessary at the outset to 
become familiar with the different types of work carried on 
at the factory and to learn the technical terminology. This 
knowledge was necessary for later analysis of the mental fac- 
tors involved in various factory operations. One of the 
assistants to the Factory Superintendent was detailed as aide 
to the writer,—conducting him about the factory, explaining 
the work and introducing him to various foremen so that it 
was possible to get acquainted and talk over the mental and 
physical qualifications for various kinds of work. A further 
study was made of the methods of accounting, time study and 
evaluating piece-work. These facts were of value in getting 
production ratings of men who were tested, and following up 
new workers. Data were secured on labor turnover to see 
what departments apparently needed the most attention. It 
was necessary, also, to become familiar with the executive 
organization in order to know to whom to go for information 
or to get various things done. 

The second preliminary step was to get in the proper rapport 
with the personnel whose co-operation would be needed in 
the work. The factory at the time was managed by an operat- 
ing council of five men, rather than a single manager. An 
early meeting with this council was arranged in order to dis- 
cuss plans. These men readily saw the purpose and the gen- 
eral methods of the work. Shortly after this a meeting of all 
the foremen was called. They had an organization of their 
own and this meeting was scheduled as one of their regular 
meetings with the writer as an outside speaker. A talk was 
given about personnel work in the army leading up to the 
question of round pegs in square holes in industry. The mat- 
ter was finally brought home to them and the proposed work 
explained in detail and their co-operation enlisted. Practically 
all of them were much interested in the matter and this
preliminary discussion was all that was ever necessary in
obtaining their co-operation in the work. This rapport was also
furthered by the writer’s frequent presence in the factory. He 













visited different departments very often, talked with the fore- 
men, watched the workmen and made himself felt generally 
as a part of the organization. Shortly after the foremen’s 
meeting a preliminary series of tests covering a rather wide 
range of mental factors was given to the executives, to most 
of the foremen, and to a sampling of the factory workers. 
This served the purpose of familiarizing them with the nature 
of mental tests. This was especially desirable because from 
information that reached the writer indirectly it was found 
that the conception of mental tests held by the majority of 
the factory personnel varied from phrenology to microscopic 
x-ray examination of the brain. The results of this prelim- 
inary series were of some statistical import, but their greatest 
function lay in familiarizing those whose co-operation was 
especially needed with the methods. 


3. Obtaining the Criterion 


As early as possible in the research (as soon as the foremen 
had been tested) vocational ratings of a large number of 
workers in the factory were obtained. The writer learned a 
bitter lesson in the Air Service in testing large numbers of 
men and being unable to get subsequent vocational ratings. 
It was highly advantageous to have the criterion before any 
tests were given at all. This made it possible to send to the 
factory the names of men whom it was desired to test. Thus 
when only a sampling of those engaged in a given type of work 
was to be tested, it was feasible to get a sampling correspond- 
ing roughly to a normal distribution curve for ability in that 
work. Otherwise the sampling might comprise those at one 
extreme of ability or only those in the middle range with 
resulting lower correlations due to the homogeneity of the 
group. Furthermore if some of the men desired were un- 
available (e. g., were on the night shift) it was possible to 
substitute others of similar vocational ability. 

Two measures of vocational ability were desired in order 
to correlate with two scores in each test, and correct the co- 
efficient for attenuation. In several operations three ratings 
besides the production ratings (piece-work) were available; the
inspector who had perhaps a dozen men under him furnished 
one rating, the foreman who had charge of finishing or build- 
ing or cord building, etc., furnished a second, and the head 
foreman of the whole floor a third. One of these could then 
be combined with the production rating and the other two 
averaged. 
















The rating sheet used was similar to that described by 
Miner.¹ Five adjacent columns each ten millimeters in width
were headed “highest 5th,” “middle 5th,” etc. The names
of the workmen were typed in a column at the left and the 
foremen requested to place after each man’s name a cross in 
the appropriate column. The foremen were further encour- 
aged to grade a man’s ability as finely as possible by placing 
the cross to the right or to the left of the column. These 
ratings were measured in millimeters from the edge of the 
chart and furnished a score between 0 and 100. Inasmuch as 
the ratings by different foremen and inspectors varied con- 
siderably,—some rating all their men higher than did others,— 
each foreman’s ratings were averaged, the standard deviation 
computed and the original measures converted into ratios of 
individual deviation to standard deviation for that foreman. 
The production ratings were treated in the same manner. It 
was then valid to average one foreman’s rating with the piece- 
work score and to average together the other two foremen’s 
ratings. These two figures could then be correlated with the 
two test measures (infra). This method was used in all cor- 
relations where the coefficient was corrected for attenuation. 
In some factory operations which were not to any considerable 
extent specialized or in which only a small number of workers 
were engaged, the foremen were asked merely to select a few 
of the best and a few of the worst workmen. The desirability 
of correcting the coefficient of correlation whenever possible 
was shown by the low correlations sometimes found between 
the ratings of the same workmen by different foremen. These 
coefficients were sometimes as low as .60 whereas correlations 
between two parts of the same test were generally in the 
vicinity of .90. 
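
The conversion of each rater's figures into ratios of deviation to standard deviation, and the subsequent pairing into two vocational measures, can be sketched as follows. This is a minimal illustration in Python and not the writer's own worksheet: the function names `standardize` and `combine`, the choice of the population standard deviation, and every figure in the example are assumptions made for illustration only.

```python
from statistics import mean, pstdev

def standardize(scores):
    """Convert one rater's scores to (score - mean) / standard deviation."""
    m = mean(scores)
    sd = pstdev(scores)  # population S.D. is an assumption; the text does not say which
    return [(s - m) / sd for s in scores]

def combine(a, b):
    """Average two standardized series, workman by workman."""
    return [(x + y) / 2 for x, y in zip(a, b)]

# Hypothetical figures for the same five workmen.
inspector = [20, 35, 50, 65, 80]        # rating sheet, 0 to 100
foreman = [55, 60, 70, 85, 90]          # rating sheet, 0 to 100
head_foreman = [30, 50, 45, 70, 75]     # rating sheet, 0 to 100
piece_work = [410, 455, 500, 520, 590]  # production rating, arbitrary units

# First vocational measure: one rating averaged with the production rating.
v1 = combine(standardize(inspector), standardize(piece_work))
# Second vocational measure: the other two ratings averaged together.
v2 = combine(standardize(foreman), standardize(head_foreman))
```

After standardizing, a lenient rater and a severe rater are on the same footing, which is what makes the averaging legitimate.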

4. Selection of Tests


There are two current tendencies in developing methods of 
vocational selection. The first strives to reproduce in minia- 
ture the work or the mental situation involved. This method 
might be illustrated by Dodge’s tests for Gun-pointers in the 
Navy. The second attempts to analyze the mental abilities 
involved in the work, and test these separately. This method 
might be typified by the Air Service tests of aptitude for 
flying. In both cases the procedure is, of course, empirical,— 
evaluating the methods upon persons of known ability before 
applying them to unknown applicants. 





¹ J. B. Miner, “Evaluation of a Method for Finely Graded Estimates
of Abilities.” Journal of Applied Psychology, 1917, 1, 123-133.




The latter of these two methods was used in the present 
instance. The specialized forms of work in a tire factory 
involved so many minor operations as to necessitate very com- 
plex apparatus. Several involved the handling of new rubber 
stock, which is sticky,—a condition difficult to reproduce in 
standardized form in the laboratory. Consequently various 
forms of work were analyzed into their mental and motor 
components, and a limited number of tests measuring these 
components were applied to workers of a given sort. 

The majority of the tests were given by means of mimeo- 
graphed blanks to a group of people simultaneously. Others 
involved individual examination and a certain amount of ap- 
paratus. Thirty-two tests were used in all covering a rather 
wide range. It is difficult and unnecessary to say what mental 
factor a test specifically measures, but there were in the list 
of tests selected some in which the emphasis lay upon motor 
processes, reaction time, attention, observation, learning, asso- 
ciation, memory, reasoning, space and time perception, social 
insight, ingenuity and ability to follow directions. 

The tests given on blanks were adaptations of recently 
devised methods drawn from various sources, such as the 
material left by the late Professor Münsterberg, the work of
the Carnegie Institute of Technology, various test blanks used 
in the Air Service and the Sanitary Corps in the army, and 
scattered periodical accounts. In most cases the material 
was worked over considerably. Anything distinctly academic 
in categories, terminology or range of information was sim- 
plified to come within the range of the average workman. 
Geographical factors were likewise adapted to local condi- 
tions and all extraneous matters eliminated as far as possible. 
Each test was given on a separate sheet with a brief example 
at the top. Standardized instructions involving the explana- 
tion of the example were used. Tests in this form were 
suitable for group examination. They were all performed 
with a time limit. They were tried first of all upon a few 
people in order to determine a proper time interval which 
would not be unnecessarily long but would still give consid- 
erable scatter in amount done by different persons and in 
order to include enough material so that no one would finish 
the test within the allotted time. The usual precautions were 
taken as to seating, lighting, sharpening of pencils and start- 
ing and stopping at signal. 

The individual tests were mostly adaptations of the con- 
ventional motor tests, described in the manuals such as Whip- 
ple’s. Various types of reaction time were also measured 










with a chronoscope. A few tests devised by the psychology 
sub-committee on Aviation of the National Research Council 
were included. A few people were given these tests as a 
preliminary step to determine the number of trials or the rate 
of performance (if done at a controlled speed). The amount 
of practice and the instruction for each performance were 
standardized. 

All the tests except those in the preliminary series described 
below, in which no careful correlations were to be obtained, 
were given in two installments. Ideally these should occur 
on separate days, but it was fairly satisfactory to go through 
the first installments of each test given to a particular person 
or group of persons in turn and then go through the second 
installments in the same order. Each installment was divided 
in two equal parts on the basis of time. This was facilitated 
in group examination by having the persons make a check 
mark at the point where they were working at the given signal. 
With these four test measures the first one of the first install- 
ment was averaged with the last one of the last installment 
and the other two likewise averaged together. This gave two 
test measures which could be correlated with the two vocational 
measures described above to correct the coefficient of correla- 
tion for attenuation. 
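
The pairing of the four part-scores into two test measures can be written out in a few lines. The sketch below is illustrative only; the function name `two_test_measures` and the scores are hypothetical.

```python
def two_test_measures(first_installment, second_installment):
    """Reduce four part-scores to two test measures.

    Each installment is a (first_half, second_half) pair of scores.
    The first half of the first installment is averaged with the last
    half of the last installment; the remaining two halves are averaged
    together.
    """
    a1, a2 = first_installment
    b1, b2 = second_installment
    t1 = (a1 + b2) / 2
    t2 = (a2 + b1) / 2
    return t1, t2

# Hypothetical scores for one person on one test.
t1, t2 = two_test_measures((14, 18), (19, 21))
```

Pairing the earliest part with the latest, and the two middle parts with each other, presumably spreads any practice effect about equally over the two measures.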


5. The Laboratory 


All the testing in the research stages of the work was done 
by the writer. A large well lighted room in the office build- 
ing was equipped for a laboratory, containing the apparatus 
for the individual tests and table facilities for testing several 
people simultaneously on group tests. For larger groups (ten 
or twelve) a large private office with extra tables was used. 

It is sometimes suggested that a laboratory of this sort 
should be in the factory itself, in order to obtain the work- 
men at the right time, to afford no sharp contrast with shop 
conditions and consequent nervousness, and to keep the ex- 
perimenter in touch with the shop work. In the present 
instance the use of the laboratory was perfectly satisfactory.
A list was sent the day before to the man in charge of time 
cards at the factory of the men whom it was desired to test 
on the morrow, together with the hour at which they were 
wanted and the approximate length of time they would be 
needed. There appeared to be no great shock on the part of 
the workmen in coming from the factory to the more quiet 
laboratory, for the news of the work and the nature of the 
tests spread rapidly through the personnel. Furthermore the 



















writer found ample contact with factory conditions in inter- 
viewing foremen and watching various men at work. 


6. Preliminary Series 


The first actual testing done at the factory consisted of a 
set of twenty group tests covering a fairly wide range of 
mental factors and occupying about two hours. The purposes 
of this series were three: first, as indicated above, to
familiarize those in authority, especially the operating council
and foremen, with the methods so that they would be in a
better position to co-operate in getting workmen to take the
tests and in providing ratings; secondly, to determine which
types of tests were most promising for further study in
connection with special factory operations; thirdly, to obtain
a rough notion of the distribution of intelligence throughout 
the factory. 

The tests were given to the operating council, other execu- 
tives in the factory, some of the clerical force and laboratory 
personnel and a sampling of workers in a considerable number
of factory operations, including equal numbers of good and 
poor workers of each sort. After the foremen had been tested 
there was a pause while getting ratings of the workmen in 
order to select the aforementioned samplings.

Each test was averaged for each occupational group and 
also for the entire number of people tested. The group 
averages, and in some cases individual scores, were all tabu- 
lated as a ratio of the score in a test to the grand average for 
that test. It was found that occupational group averages in 
some tests showed a much greater difference between extremes 
than in other tests. For example, the extremes of one test 
ranged from 81 to 112 per cent, whereas in another test they 
ranged from 5 to 184 per cent. The first of these would 
obviously be of little value in differentiating groups of workers 
of the sort studied. Furthermore certain groups were markedly
superior or inferior in certain tests. These facts together
with a detailed individual study of good versus poor workers 
in various groups, gave the starting point for subsequent work, 
indicating, for example, whether motor, attentive or memory 
processes were most involved in a given factory operation. 
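
The tabulation of group averages as a ratio to the grand average, and the comparison of extremes, might be sketched as below. The function names and the scores are hypothetical; only the arithmetic follows the description above.

```python
from statistics import mean

def percent_of_grand_average(scores_by_group):
    """Express each group's mean as a percentage of the grand mean
    taken over every person tested."""
    everyone = [s for scores in scores_by_group.values() for s in scores]
    grand = mean(everyone)
    return {g: 100 * mean(s) / grand for g, s in scores_by_group.items()}

def spread(percentages):
    """Difference between the highest and lowest group percentages."""
    return max(percentages.values()) - min(percentages.values())

# Hypothetical scores on a single test for three occupational groups.
one_test = {
    "clerical": [42, 38, 45],
    "finishers": [30, 33, 27],
    "laborers": [12, 20, 16],
}
pct = percent_of_grand_average(one_test)
print(pct, spread(pct))
```

A test with a small spread between extremes separates the groups poorly, as in the 81 to 112 per cent case mentioned above.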

All twenty tests were averaged together for each occupa- 
tional group. This average of the twenty tests, inasmuch as 
they covered a fairly wide range, gave a rough index of general 
intelligence. Presumably any combined score of a consid- 
erable range of tests gives such an index and it is merely then 
a question of determining norms. This is what is done, of 













course, in various point-scales for intelligence. Comparing the 
different groups of workers in intelligence there was found 
an occupational hierarchy. The operating council, laboratory 
force and clerical workers stood rather high in intelligence. 
The shipping clerks and general factory executives were appre- 
ciably lower. Still lower came the foremen and fairly skilled 
workers, such as finishers and builders. Below these were the 
less specialized forms of factory workers and at the bottom 
of the scale the unskilled laborers. This occupational hierarchy 
is interesting in its agreement with the similar one found in 
the army on the basis of the Alpha examination.² It would
seem that there are different kinds of industrial performance, 
each with its minimum intelligence requirements and that a 
person tends finally to reach the highest level for which his 
intelligence qualifies him. 


7. Statistical Methods 


In subsequent series test scores were correlated with known 
vocational ability, in order to select the best tests and weight 
them properly to obtain the best possible vocational prediction. 

As soon as the ratings and the test scores were in final 
form, the correlation formula involving rank differences 
squared was used for a preliminary selection of the tests. The 
best ones were then correlated by the “products-moments”
formula. There were two scores in each test and two meas- 
ures of vocational ability as above described. The necessary 
correlations of these were computed to be used in the fol- 
lowing formula: 

$$
\frac{\sqrt[4]{V_1T_1 \cdot V_1T_2 \cdot V_2T_1 \cdot V_2T_2}}{\sqrt{V_1V_2 \cdot T_1T_2}}
$$


where $V_1$ and $V_2$ are the two vocational ratings (cf. section
3 supra) and $T_1$ and $T_2$ are the two test measures (section
4 supra) and $V_1T_1$ indicates the correlation between $V_1$ and
$T_1$, etc. To correct the coefficients for the inter-test
correlations, the simpler formula was used:


$$
\frac{\sqrt{A_1B_2 \cdot A_2B_1}}{\sqrt{A_1A_2 \cdot B_1B_2}}
$$

in which $A_1$ is the first measure in test A, $A_2$ the second
measure in test A, and similarly for $B_1$ and $B_2$, and $A_1B_2$
indicates the correlation between these two measures. This latter
formula seemed justifiable for inter-test correlations because
the two parts of each test correlate highly, whereas foremen’s
ratings do not inter-correlate as highly, and consequently more
careful correction is necessary.


² The only published account of this study is an abstract of a paper
read at the 1918 meeting of the American Psychological Association by
J. W. Bridges, Psychological Bulletin, 16, 1919, p. 42.
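
The two corrections for attenuation can be tried numerically with the short sketch below. It is an illustration under stated assumptions rather than the writer's computation: the function names are hypothetical, all correlations are assumed positive, and the cross-pairing of subscripts in the simpler formula follows the usual form of Spearman's correction, since the subscripts are not legible in every copy of the text.

```python
from math import sqrt

def corrected_fourth_root(v1t1, v1t2, v2t1, v2t2, v1v2, t1t2):
    """Fourth root of the product of the four test-criterion correlations,
    divided by the square root of the product of the two reliabilities."""
    return (v1t1 * v1t2 * v2t1 * v2t2) ** 0.25 / sqrt(v1v2 * t1t2)

def corrected_simple(a1b2, a2b1, a1a2, b1b2):
    """Simpler form used for inter-test correlations (cross terms assumed)."""
    return sqrt(a1b2 * a2b1) / sqrt(a1a2 * b1b2)

# Hypothetical raw correlations of .45 throughout, with the foremen's
# ratings agreeing to the extent of .60 and the two test halves .90,
# the orders of magnitude mentioned in section 3.
r = corrected_fourth_root(0.45, 0.45, 0.45, 0.45, v1v2=0.60, t1t2=0.90)
print(round(r, 2))
```

The lower the agreement between the two ratings or the two test halves, the larger the corrected coefficient becomes relative to the raw one.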

The coefficients of partial correlation were then obtained 
for the given factory operation and the regression equation 
computed. The writer usually follows the method outlined 
by Rosenow.³ The practical problem is to select as few tests
as possible while still giving a high prediction. Five tests 
may give, for example, practically as good a prediction as 
seven tests with a consequent saving of time. Rosenow’s 
method was followed of computing from a limited number of 
partial coefficients the final R of multiple correlation, i. e.,
the correlation of combined weighted test scores with the 
criterion. It is convenient in numbering the variables to take 
as the end ones, those most apt to be dropped,—for example 
with a criterion $X_1$ and 6 tests, $X_2$ to $X_7$, it is wise to select
as $X_2$ and $X_7$ the tests which have lowest correlation with
the criterion. Thus when the R of multiple correlation for 
all six tests has been found, without additional labor one can 
determine the R that will be obtained by dropping test two 
or test seven. 
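
The effect of dropping a test on the R of multiple correlation can be illustrated with the sketch below. It does not reproduce Rosenow's procedure of working through successive partial coefficients; instead it solves the normal equations for the standardized regression weights directly, which yields the same R. The function names and all correlations in the example are hypothetical.

```python
from math import sqrt

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def multiple_r(r_crit, r_tests, keep):
    """Multiple R of the criterion on the tests whose indices are in `keep`.

    r_crit[i]      correlation of test i with the criterion
    r_tests[i][j]  correlation between tests i and j
    """
    sub = [[r_tests[i][j] for j in keep] for i in keep]
    rc = [r_crit[i] for i in keep]
    beta = solve(sub, rc)  # standardized regression weights
    return sqrt(sum(b * r for b, r in zip(beta, rc)))

# Hypothetical correlations for three tests; the last has the lowest
# correlation with the criterion and is the candidate for dropping.
r_crit = [0.50, 0.40, 0.10]
r_tests = [[1.0, 0.3, 0.2],
           [0.3, 1.0, 0.1],
           [0.2, 0.1, 1.0]]
r_all = multiple_r(r_crit, r_tests, [0, 1, 2])
r_two = multiple_r(r_crit, r_tests, [0, 1])
print(round(r_all, 3), round(r_two, 3))
```

If the two values are nearly equal, the dropped test contributes little and the saving in testing time costs almost nothing in prediction.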

The regression equation was then worked out for the vari- 
ables which were retained. A table for weighting the test 
scores was made, the original test measures weighted accord- 
ingly and the combined score correlated with the criterion as 
a check on the whole procedure. The individual combined 
weighted scores were now averaged and their standard deviation
computed in order to make from the probability integral
a decile distribution. A ten by ten fold table was then con- 
structed from the probability integral for the given R of 
multiple correlation showing what per cent of those in the 
highest tenth in one variable were in the highest, second, third, 
etc., tenth in the other variable. Substituting for the deciles 
in one variable the deciles of combined weighted test scores 
and letting the other variable represent vocational ability, 
it was possible from the table to predict the probability that a
person attaining a given score in the test would fall within
any decile of vocational ability.
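
A ten by ten fold table of this kind can be approximated numerically. The sketch below assumes that test score and vocational ability follow a normal correlation surface with the given R, and it replaces the printed tables of the probability integral with the `NormalDist` class of the Python standard library. The function name `decile_table` and the number of integration steps are illustrative choices, not part of the original procedure.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # unit normal curve

def decile_table(r, steps=400):
    """table[i][j] = per cent of persons in tenth i of test score who fall
    in tenth j of ability (index 0 is the highest tenth), assuming a
    bivariate normal distribution with correlation r."""
    # Ability cut points in standard units, from highest tenth to lowest.
    cuts = ([float("inf")]
            + [STD.inv_cdf(1 - k / 10) for k in range(1, 10)]
            + [float("-inf")])
    s = sqrt(1 - r * r)  # S.D. of ability about the regression line
    table = []
    for i in range(10):
        row = [0.0] * 10
        for k in range(steps):
            # Midpoint rule over equal-probability slices of tenth i.
            u = 1 - (i + (k + 0.5) / steps) / 10
            x = STD.inv_cdf(u)
            for j in range(10):
                upper = STD.cdf((cuts[j] - r * x) / s)
                lower = STD.cdf((cuts[j + 1] - r * x) / s)
                row[j] += (upper - lower) / steps
        table.append([100 * p for p in row])
    return table

table = decile_table(0.61)
print([round(p) for p in table[0]])  # distribution of the highest tenth in score
```

Each row sums to 100, and with a correlation of zero every cell would be 10, which is a convenient check on the computation.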


8. Tests for Special Factory Operations


It was necessary, of course, to limit the number of tests





³ Rosenow, C. The Analysis of Mental Functions, Psychological
Review Monographs, No. 106, 1917 (appendix).



















given to special operatives to those forms which had greatest 
promise. This limiting was done in two ways. In the first 
place the operations were analyzed as far as possible by 
observing men at work and talking with foremen as to the 
mental qualifications needed for the given operation. Sec- 
ondly the preliminary series (supra) in which a few good and 
poor workers of various sorts took a wide range of tests, made 
it possible to limit the number of tests to the general cate- 
gories which appeared of significance in this preliminary study. 

The first operation studied intensively was that of finishing 
tires. There was a large labor turnover in this department 
and it apparently involved rather specialized ability. The 
worker received the tire partially built on the iron core, put 
stock of lighter weight on the outside and rolled it down, 
spinning the tire by hand. He was furnished with strips of 
stock of varying widths which had to be applied to certain 
parts of the tire, in some cases following a line rather closely. 
This operation was grouped with that of treading tires because 
the two were quite similar, except that in treading, no stock 
was applied near the rim. Thirteen tests were given to fin- 
ishers of known ability,—tests involving motor ability, learn- 
ing, attention, reaction time and judgment of distance and 
velocity. One surprising result was the low correlation of 
vocational ability with most of the tests involving motor co- 
ordination such as the conventional forms of tapping, aiming 
and steadiness tests. Six of the tests correlated rather highly 
with ability at the work, but it was found that three tests 
properly weighted gave practically as good a prediction. These 
tests involved underlining adjacent pairs of numbers whose 
sum was ten in a large group of numbers, finding consecutive 
numbers scattered at random over a page, and simple visual 
reaction. These tests properly weighted gave a correlation 
with finishing ability of .61. 

The second group of operatives studied were those who 
handed out stock. The operation was fairly simple involving 
looking at the numbers on tags at the finishing benches, re- 
membering the specifications long enough to go to the place 
where the stock was “booked” (placed between sheets of 
cloth to avoid sticking), and supplying the stock to the finisher. 
This was not a difficult operation, but had a considerable turn- 
over. Ten tests were given involving memory, reasoning, 
directions, association, learning and attention. The regression 
equation for eight variables (that is seven tests) gave a corre- 
lation with ability at this work of .71. Using only three of 
these tests, a correlation of .67 could be obtained. These three 













tests involved underlining adjacent numbers whose sum was 
ten, underlining adjacent letters which formed words in an 
unseparated mixture of letters and memory tested by the 
method of word pairs. 

The operation of tire building was somewhat like finishing. 
The iron core was placed on the machine, the stock from the 
rollers above pulled down and placed on the core, the clutch 
thrown with the foot to spin the tire and then the rollers 
applied by turning a small lever. Several plies were placed on
each tire as well as a bead near the rim. Thirteen tests similar 
to those used with finishers were given to builders of known 
ability. The results were negative,—no correlation being 
greater than .16. There were three possible explanations: (1) 
None of the mental factors tested were involved in the building
operation; (2) the vocational ratings were unreliable; (3)
the men did not work with maximum effort at the tests. As 
to the first possibility it is doubtful, in view of the range of 
tests tried, that none of the factors involved in building were 
represented. As to the second point the correlation coefficients 
were corrected for attenuation, which should theoretically com- 
pensate, to a considerable extent, chance errors in the voca- 
tional estimates. The third explanation seemed the most prob- 
able. The average worker did not feel that his present job 
was permanent, but was considering possibilities of advance- 
ment or discharge. The average or good builder, however, 
received good pay and felt that his position was permanent 
and was consequently less interested in the tests (witness the 
fact that a number of builders refused to take the tests, whereas 
this seldom occurred with other operatives). The poor build- 
ers, however, might realize their ability and try hard in the 
tests. There were obvious cases where a builder appeared 
rather indifferent to the work. It was planned to check this 
factor later by testing new help who were hired for building. 
In the employment office they would doubtless work with 
maximum incentive. 

The clerical workers were made the basis of a brief study. 
No effort was made to differentiate various kinds of office 
work, but the employees were rated merely for general clerical 
ability aside from stenography. Ratings were made by various 
managers and heads of departments. Ten tests involving 
attention, memory, association, and so-called alertness were 
used. Four of these tests were retained in the regression 
equation,—underlining adjacent letters which formed a word, 
reading a text in which alternate letters were to be omitted 
in order to make sense, an arithmetical test, and an analogies 
















test in the form given in the army Alpha. This combined test 
score correlated with clerical ability to the extent of .5S. 

It was necessary to group together all the remaining gen- 
eral factory operations. In all the other forms of work there 
were either too few people engaged in the same sort of work
to make a valid basis for statistics, or the work itself was 
quite simple and obviously did not need a specialized mental 
make-up. The problem then was to devise a simple group of 
tests to differentiate good workers in moderately skilled operations
in general from poor workers, and to further differentiate
this class from unskilled laborers. A sampling of good
and poor workers was taken from various parts of the factory 
including final inspection, bicycle tires, bead wrapping, wire 
department, sundries, stock rolling and booking, solid tire, etc. 
A few of the best and worst workers of each sort were selected 
by the foremen. They were given nine different tests. With 
the ratings merely in two groups, detailed statistical analysis
was impossible so the method of unlike signed pairs was
used. Five tests were retained, inter-correlated and weighted
by inspection. These tests were mostly of the sort generally 
characterized as intelligence tests such as memory for word 
pairs, association of opposites, absurdities, substitution and 
the like. There was a fairly close correspondence between 
combined test score and general ability in the factory. As 
nearly as could be estimated this correlation was about .50. 
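
The method of unlike signed pairs can be sketched as follows, taking it in the form usually given in the manuals, r = cos(π U / (L + U)), where U is the number of persons whose deviations from the two medians have unlike signs and L the number with like signs. That this exact form was the one used here is an assumption, and the function name and data are hypothetical.

```python
from math import cos, pi
from statistics import median

def unlike_signs_r(x, y):
    """Estimate r as cos(pi * U / (L + U)), with signs of deviation
    taken about the median of each variable."""
    mx, my = median(x), median(y)
    like = unlike = 0
    for a, b in zip(x, y):
        da, db = a - mx, b - my
        if da == 0 or db == 0:
            continue  # a score exactly at the median carries no sign
        if (da > 0) == (db > 0):
            like += 1
        else:
            unlike += 1
    return cos(pi * unlike / (like + unlike))

# Hypothetical data: 1 = selected by the foreman as good, 0 = as poor.
ability = [1, 1, 1, 1, 0, 0, 0, 0]
score = [30, 28, 25, 14, 22, 12, 10, 9]
r = unlike_signs_r(ability, score)
print(round(r, 2))
```

Since only the signs are used, the estimate is coarse, which fits the statement that the correlation could be given only approximately.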


9. Testing New Employees 


Most of the men passing through the employment office dur- 
ing about a month, with the exception of those who were 
obviously fitted for only unskilled labor, were tested. The 
ten by ten fold table described above was too complicated for 
use in the employment office, so a simplified form like the 
following was devised: 

“Work: Tire Finishing

By good workers is meant the best three tenths of a large 
number of workers taken at random. 

By average workers is meant the next three tenths, i. e., the
4th, 5th and 6th.

By poor workers is meant the next three tenths, i. e., the
7th, 8th and 9th.

By very poor workers is meant the lowest tenth. 

Of a hundred men scoring in tests: 

128 or better: 73 will be good, 21 average, 6 poor, 0 very 
poor. 

























103 to 127: 56 will be good, 31 average, 12 poor, 1 very 
poor, etc. 

These figures instead of showing how many out of a hun- 
dred people obtaining a given score will be good, average, etc., 
may be used equally well to predict a probability of a single 
applicant; for example, if a man scores 128 or better, the 
chances are 73 out of 100 that he will be a good worker, 21 
out of a 100 that he will be an average worker, etc.” 

It was desirable both in this blank and in personal con- 
versation to emphasize the probability aspect of the predic- 
tion in order to minimize the effect of the occasional cases 
apparently violating the general rule. Business people are 
very apt to attach great weight to a single case in which a 
person who does well in the tests does poorly in the factory, 
neglecting the fact that there is always a certain probability 
of such a thing happening. 

These tables for the various correlation coefficients or for 
the various factory operations were placed in the hands of 
the employment manager. Most of the recommendations at 
this time were made in person by the writer with these blanks 
before the manager. In a larger establishment where there 
were various parties concerned with hiring employees and 
where more occupations had been studied, it would doubtless be 
advisable to have a blank form printed for each factory opera- 
tion, enter the individual’s name on this blank, and place a 
check mark in the class into which he fell in his test scores, 
thus indicating at a glance his probable success in that work. 

In the present instance effort was made to hire men who 
fell only in the best five or six tenths in test scores. The 
per cent of rejections would depend, of course, upon the labor 
market. During the war, for instance, industries were com- 
pelled to hire practically anyone who was available regardless 
of ability or probable success. Supposing, for example, the 
lowest ten per cent of workers of a given sort constituted the 
labor turnover, if men were hired at random for this work 
the turnover would be ten per cent. If test score and vocational
ability correlated to the extent of .70 and enough applicants
were available so that only the best half could be
selected, the decile tables above described show that this labor 
turnover could be reduced to one per cent. With a coefficient 
of .60, the ten per cent turnover could be reduced to about 
two per cent. If the two lowest deciles represented labor 
turnover, with a coefficient of .70, this twenty per cent turnover
could be reduced to about four per cent. Any other
expectation on the basis of a given coefficient for a given 
percentage which constituted a turnover could be worked out 
from the tables. 
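
The turnover figures in the preceding paragraph can be checked approximately without the decile tables. The sketch below is not the procedure used at the factory. It assumes that test score and vocational ability follow a bivariate normal distribution with the stated coefficient, and it integrates numerically to find what share of the men selected would fall in the lowest part of the ability distribution. The function name and the step count are illustrative.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def low_share_after_selection(r, selected_share, low_share, steps=20000):
    """Share of the selected men whose ability lies in the lowest `low_share`.

    Assumes test score and ability are bivariate normal with correlation r,
    and that only the top `selected_share` of test scores are hired.
    """
    x_cut = STD_NORMAL.inv_cdf(1.0 - selected_share)  # test-score cut-off
    y_cut = STD_NORMAL.inv_cdf(low_share)             # ability cut-off
    lower = -8.0                                      # stands in for minus infinity
    width = (y_cut - lower) / steps
    spread = sqrt(1.0 - r * r)                        # s.d. of test score at fixed ability
    total = 0.0
    for i in range(steps):
        y = lower + (i + 0.5) * width                 # midpoint rule
        passed = 1.0 - STD_NORMAL.cdf((x_cut - r * y) / spread)
        total += STD_NORMAL.pdf(y) * passed * width
    return total / selected_share


for r, low in [(0.70, 0.10), (0.60, 0.10), (0.70, 0.20)]:
    share = low_share_after_selection(r, selected_share=0.5, low_share=low)
    print(f"r = {r:.2f}, lowest {low:.0%} of workers: {share:.1%} of those hired")
```

Under these assumptions the three cases come out close to the one, two and four per cent given in the text, although the tables themselves may have been computed somewhat differently.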

The test program in the case of given workers depended 
largely upon the positions vacant in the factory. At the time 
most of the new help were tested by the writer, there was a 
considerable need for tire finishers. Consequently all appli- 
cants who appeared at all promising were given the tests 
described above for finishers. If a number presented them- 
selves at once they were given together two of the tests and 
then each individual was taken separately through a measure- 
ment of visual reaction time. A stenographer provided with 
stencils for correcting the tests and tables for weighting them 
was able generally to keep up with this procedure, so that a 
few minutes after the last man had been through the reaction 
time test, vocational predictions were available for the entire 
group. Those who fell in the upper five or six tenths in the 
finishing score were immediately recommended for that work. 
Those who fell below this standard were given tests for 
handing out stock, if workers of that sort were needed at 
the time, otherwise they were given the test for general factory 
work. If they did well in this latter they were recommended 
for any general moderately skilled operation where there was 
need, but if they did poorly in this series, they were recom- 
mended for some unskilled laboring operation. If there was 
a need for clerical help, some of those who appeared more 
alert were given the clerical tests. This whole procedure would 
doubtless be different at a time when workers for some other 
operations besides finishing were badly needed. The general 
system with a large number of regression equations available 
would be to give first the tests for those occupations where 
there was the greatest need. 

It occasionally happened that the recommendations on the 
basis of the tests ran counter to an individual’s interest. A 
man naturally succeeds better at something in which he is
interested. If, however, he preferred a certain type of work 
for which he was mentally or physically unfit, it was an eco- 
nomic loss both to himself and to the company to engage him 
for that work. When such a situation was found, effort was 
made to dissuade the man from the work which he wished to 
do. Failure in the work would doubtless cause a loss of
interest, and the plodder who learns after a long time with
economic loss to himself and to the company, is an exception. 
In many cases where the man appeared rather intelligent, the 
method of developing the tests and his probable expectation 
on the basis of his results were explained to him, so that he 
understood that his recommendation for some other occupation 
was due to the fact that he stood a poor chance of success in 
the one he preferred. 

As the writer did not intend to remain with the factory 
permanently, a man was trained to carry on the work of testing 
applicants in the employment office. He first spent some 
time watching the writer give the tests and studied the in- 
structions carefully. He then gave the tests himself under 
the supervision of the writer and all undesirable reactions 
were corrected. He was provided with a manual and tables 
for weighting the tests so that it was possible for him to 
obtain the weighted test scores with perfectly standardized 
method. 

Effort was made during this month to follow up the men 
hired on the basis of test scores. Few of the men reached 
the piece-work stage during this period, but record was ob- 
tained of their daily production and in many cases estimates 
obtained from foremen as to their probable success. Had 
the writer remained on the ground longer, these estimates, 
of course, would have been much more systematized. Prac- 
tically all the men who scored above the average in the tests 
and were hired on that basis were doing successful work when 
the writer left. A number of men with low test scores were 
hired as a check on the method. All of these gave up the work 
in the course of a short time, indicating that lack of success 
seemed to produce lack of interest.

The work was running smoothly when the writer left. Shortly 
after that time, there was a change in the executive personnel 
of the company,—the president and general technical super- 
intendent, who had been the main sponsors of this work, being 
among those who resigned. At the present writing the work 
has been temporarily dropped, but it is hoped that it will later 
be resumed because at last accounts it was definitely vindi- 
cating itself. 


10. Conclusions 


The foregoing account of research at a rubber tire factory
illustrates one method of psychological approach to the problems
of vocational selection in industry. The general procedure
was the standardization of mental and motor tests upon
workmen of known vocational ability in order to use these 
standards in assigning applicants at the employment office to 
the type of work in which they stood the greatest chances of 
success. The operations at the factory seemed to fall into 
three large classes: those requiring no special mental ability, 
such as shoveling and trucking; those involving the learning 
of a few relatively simple operations and requiring a modicum 
of general intelligence; and those involving more definitely 
specialized mental or motor abilities. The last of these pre- 
sented by far the largest and most difficult problem. 

Effort was made in all cases to obtain the criterion, i. e., 
the vocational or production rating before tests were given 
at all. Ratings made by a given foreman were transformed 
into terms of the standard deviation of his ratings. They 
could then validly be averaged with estimates made by another 
foreman. A wide range of tests, both group and individual, 
were used. Certain of these were selected for each operation 
studied on the basis of tests of a preliminary sampling of 
good and poor workers of the sort under investigation. This 
selected set of tests was given to workers of all degrees of 
ability at a certain operation and each test score correlated 
with the vocational rating. The most promising tests were 
retained, the coefficients corrected for attenuation, partial 
correlations computed and the regression equation derived in 
order to weight the tests and get the best possible prediction 
of vocational ability. After determining the correlation of 
combined weighted test scores with the criterion a table was 
made with deciles of combined weighted test scores tabulated 
for deciles of probable vocational success. It was thus possible 
to predict from the test scores the probability of an applicant’s 
being in the highest tenth, next highest tenth, etc. of workers 
in a given operation. These methods were applied to a 
number of rather specialized operations,—finishing, treading 
and building tires, handing out stock and clerical work. With 
the exception of building, final correlations between tests and
criterion of between .56 and .71 were obtained, thus giving 
considerable reliability to the predictions made on the basis of 
the tests. For the less specialized factory operations a general 
intelligence scale was devised to differentiate good from poor 
workers in the average moderately skilled operation and to 
further differentiate these workers from those fitted definitely 
for only unskilled labor. 
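
Two of the statistical steps summarized above lend themselves to a brief illustration: expressing each foreman's ratings in units of his own standard deviation so that they may be averaged, and correcting a coefficient for attenuation. The sketch below uses invented ratings and reliabilities, not factory data. It assumes the population form of the standard deviation and Spearman's usual correction formula, neither of which is specified in this summary.

```python
from math import sqrt
from statistics import mean, pstdev


def standardize(ratings):
    """Express one foreman's ratings in units of his own standard deviation."""
    centre, spread = mean(ratings), pstdev(ratings)
    return [(value - centre) / spread for value in ratings]


def correct_for_attenuation(r_observed, reliability_test, reliability_criterion):
    """Spearman's correction: the correlation expected with error-free measures."""
    return r_observed / sqrt(reliability_test * reliability_criterion)


# Hypothetical ratings of the same four workmen by two foremen who use
# different scales; none of these numbers come from the factory.
foreman_a = [2, 4, 6, 8]       # rates on a ten-point scale
foreman_b = [55, 60, 70, 95]   # rates in percentages

combined = [mean(pair) for pair in zip(standardize(foreman_a), standardize(foreman_b))]
print([round(value, 2) for value in combined])

# Hypothetical observed coefficient and reliabilities.
print(round(correct_for_attenuation(0.50, 0.80, 0.70), 2))
```

Once both sets of ratings are on the common scale, a workman's combined rating no longer depends on whether his foreman happened to rate generously or to use a wide range.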

These test scales were given to new employees and recommendations
made on the basis of the decile distributions.
Effort was made to hire a person for a type of work in which 
he fell in the upper five or six tenths of a normal distribution 
of combined weighted test scores. If unsatisfactory in the 
tests for one operation he was tried in those for another 
until he finally made a good showing or else demonstrated 
his fitness for only unskilled labor. If recommendations ran 
counter to an individual’s interests his chances of failure in 
the desired work were explained and an effort made to interest 
him in the recommended work. A man was trained to give 
the tests and continue the work after the writer left the fac- 
tory. The progress in the factory of new help hired on the 
basis of the tests was, at last accounts, vindicating the methods. 











WHAT INDUSTRY WANTS AND DOES NOT WANT 
FROM THE PSYCHOLOGIST¹


By Eliott Frost, Rochester, N. Y.


Our topic is predicated upon the proposition that Psychology 
is both willing and eager to place at the disposal of industry— 
as it has already done for medicine, education, and the several 
emergencies of the War—its methods and its conclusions. 

There was a time—sufficiently recent for these Emersonian 
walls to have heard and remembered—when to stress the prac- 
tical application of our science, constituted a gaucherie frowned 
upon in some of the highest circles. If recent literature re- 
flects inner conviction, that day is happily past, and consola- 
tion may be given the mourners by those of us for whom such 
a time never existed—for nothing good has really died. Our 
plight was due, first, to the youthfulness of our science, which 
felt it proper enough to be seen and not heard—but also to 
certain standards of scholarship professing to find in a liaison 
of theory with practice the liability of offspring for whom the 
assumption of paternity might prove an embarrassment. Medi- 
cine does not suffer as a science because men are healed 
thereby, nor need psychology regret finding itself similarly at 
the service of mankind. Such service as she can render will 
occasion fresh demands for service, and the incentive to search 
and research will grow with the years. It is because I assume, 
then and therefore, the response of the psychologist that I dare 
to speak for industry in terms so categorical. 

Industry is worthy of such help as she needs. The needs 
of industry to-day are unprecedentedly great. These, I take
it, are two self-evident propositions. A part of the dilemma
of industry, however, is or shortly will become, the multiplicity 
of her doctors and the variety of their bottles. Let me cite
for you a few examples from the row of industrial panaceas.
Each bottle is labeled by those who dispense it, as a “General
Specific for Industrial Diseases.” 

1 Read at the Annual Meeting of the American Psychological Association,
Cambridge, Mass., Dec. 31, 1919.

You will not be misled by such a recital into the thought
that these measures are worthless. On the contrary, I shall
mention some which I believe to be not only important but
indispensable. The list follows:

Americanization of the Alien 

Establishment of Bonus Systems 

Better Housing of Employees 

Universal Continuation Schools 

Increasing Efficiency Through Diet 

Daylight Saving Laws 

Efficiency Engineering in Factories 

The Establishment of Foremen’s Classes 

Find Yourself Campaign for Working Boys 

Graded Wage Scales 

Compulsory Health Insurance 

Industrial Democracy Systems 

Special Legislative Enactments 

Morale Work 

Diagnostic Motion Pictures in the Shop 

Nationalization of Basic Industries 

National Board of Community Speakers 

Psychological Tests 

Questionnaires 

Personnel Ratings 

Systematic Propaganda 

Recreational Athletics 

Rest Periods and Fatigue Studies 

Restriction of Immigration 

Shop Committees 

Employment Management 

Socialism 

Revision of the Metric System 

Excess profits and other Taxation 

Thrift Campaigns 

Unionism 

Vestibule Schools 

Vocational Guidance 

Welfare Work 

This list could be doubled, nor is it a fictitious one. Each 
separate proposition is brought forward by earnest and honest 
men. Each one represents an actual proposal to the writer 
by some man, or group of men, within the last three months. 
Now, it is not the lack of excellence in these proposals, but 
the confusing number of them that constitutes a handicap. To 
be sure, many of them demand a leadership by unusual personalities
or extensive and elaborate technique, or both. Even
this is, however, the minor difficulty. The tendency to-day,
despite the gravity of the problems faced, is to reject all 
panaceas from a sheer inability to analyze and choose from
among them. Industry is somewhat in the position of a man 
overboard, liable to perish from the bombardment of life- 
preservers thrown to rescue him. 

Making money is, and broadly speaking must be, the ex- 
pectation of business. The manufacturer must buy, manu- 
facture and sell to show a profit or cease to do business. If 
the psychologist is to help, he must help make money. He 
must, in short, be both specific and practical. 

Eight or ten years ago, industry was especially concerned 
with Workmen’s Compensation Laws and with methods of 
Industrial Safety. Both these research problems are now 
solved with reasonable adequacy. On the other hand, the 
unrest following War has emphasized a new set of problems, 
the major of which are: 

(a) Unionism, (b) Labor Turnover, (c) Selection and 
Training of Foremen, (d) Education of the Alien, (e) Wage 
and Hour Adjustments, (f) Housing, (g) Working Condi- 
tions, and in some States (h) Compulsory Health Insurance, 
(i) Taxation, (j) Continuation Schools. 

Four of these have special psychological implications:

First, Labor Turnover—By Labor Turnover, we mean, 
roughly, the number of men hired in a given industry, per 
year, in proportion to the number of jobs. This varies all the 
way from zero to 8000%. 100% is a small, rather than a large, 
average. That means that the employer having, say a hun- 
dred workers during the twelve months, hires a hundred addi- 
tional workers to fill vacated positions—some of these posi- 
tions, of course, being emptied and filled many times over, 
while others are not disturbed. The annual economic loss in 
this country from Labor Turnover is estimated at one and 
one-half billion dollars, or $50 per head. Practically 50% 
of it is due to lay-offs or discharge. The discharge is usually 
for one of four reasons: Inadaptability, Unwillingness to 
Work, Wrong Attitudes Toward Work, or Positive Miscon- 
duct. The other 50% is due to dissatisfaction. Where turn- 
over is high, dissatisfaction is the chief factor in the turnover. 
It is significant, on the other hand, that when the learning 
period in any job is long, turnover tends to be low; and, in 
general, of course, is lower in the skilled trades. 
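
The definition of Labor Turnover given above reduces to a simple ratio, and a short sketch may help in reading the percentages quoted. The function name is illustrative. The last two lines only work out what number of workers the quoted loss figures would imply if taken at face value.

```python
def turnover_per_cent(men_hired_per_year, number_of_jobs):
    """Labor turnover as defined above: yearly hires in proportion to jobs."""
    return 100.0 * men_hired_per_year / number_of_jobs


# The employer in the example: a hundred jobs, a hundred men hired to refill them.
print(turnover_per_cent(100, 100))

# Arithmetic implied by the quoted loss: one and one-half billion dollars at $50 per head.
implied_workers = 1_500_000_000 / 50
print(implied_workers)
```

The ratio says nothing about which positions are refilled; as the text notes, some may be emptied many times over while others are never disturbed.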

Psychology can help the employment manager in his hiring 
and firing and making of transfers within the plant, if it can 
devise a simple, preferably oral, readily applied test to determine
three things: (a) Intelligence, (b) Adaptability to
Particular Tasks, and (c) Temperament. 

In this connection, the well-disposed psychologist, however, 
must remember two hard cold facts: First, That labor in any 
community is usually either very abundant or abnormally 
scarce. If abundant, rough empirical tests, with recommenda- 
tions do very well. If scarce, the man will be taken on any- 
how whatever his mental rating. Second, We need to remem- 
ber that vocational training does not place a man. The Law 
of Supply and Demand, and not his ambitions or aptitudes, 
determines the career of most men in industry. 

A second timely problem, in which the psychologist has 
proper concern, is that of Americanization. This involves the 
education of the illiterate and of the foreign-born, both in English
and in American ideals.

An analysis of racial psychology and the application of par- 
ticular educational methods, devised to meet the educational 
problems of the several races, has not yet been made, so far 
as the speaker is aware. The failure of some of the most 
ambitious attempts to minister to industrial needs during the 
last four years, has been due to the failure properly to educate.
At one plant, for instance, of 800 employees, in spite
of an elaborate and particularly successful shop committee
system, an analysis of the workers showed that 500 of the 800
were either foreigners or illiterates, and had no conception of 
the machinery set up in their behalf. They, therefore, went 
out on strike. 

If the Swede, the Pole, the Italian, the Greek, are to be 
taught the advantages of industrial democracy, the lesson must 
be built upon the background of a proper racial psychology. 

There is, for instance, more than a mild tendency to racial 
monopoly in certain industries. The French Canadians drift 
toward the cotton factories, copper mining, smelting, boots 
and shoes—the Croatians toward the mines, steel and the 
filthy trades—the Danes toward leather, furniture, collars and 
cuffs—the Dutch toward silk-making, dyeing, furniture fac- 
tories, and the like—the Armenians toward cigarette making 
and peddling—Greeks into the blacksmithing, baking, shoe- 
making, boot-blacking and the like—Hebrews into the small 
manufacturing of the sweat-shop variety—Chinese into im- 
porting—French-Swiss into the hotel and restaurant business, 
silk industries and embroideries, and so on. 

Again, certain races are easily assimilated. For instance, 
the French-Swiss, Ukrainians, Russians, Poles and Japanese. 
Others are distinctly clannish, as the Slovaks, Armenians, Albanians,
Portuguese and Chinese. Such social characteristics
of the various nationalities are clearly reflected, both in their 
work, and in their relation to the industry. 

The laborer’s attitude toward industrial relations is deter- 
mined by his nationality more than any other single factor. 
The Jew, for instance, demands an arrangement in which he 
can bargain. He is continually thinking of how much he is 
receiving for his labor. He is really more conscious of his 
labor organization and its methods than he is of his creative
labor faculties. As a thinker, he is usually of the radical 
stamp. The Italian’s highly emotional nature lends itself 
readily to directions by the organizers. It is the testimony 
of the executives that he cannot be trusted without reserva- 
tions, and that he is apt to be sullen and moody. The German 
workman is of placid disposition, loves detail, is particularly 
effective on precision work. The Pole and Croat usually do 
the dirty work in the plant. For instance, in the grinding of 
lenses, a job in which the workman becomes covered with 
damp red pumice, the Poles predominate. 

In any Americanization program then, educational methods 
must vary with nationalities involved. The ideal method of 
reaching the slow-moving, generally placid Swede will not 
serve for the emotional Italian; nor that adapted to the keen- 
witted Jew for the average German mechanic in the shop. 

These facts and tendencies indicate a distinct problem in 
race psychology, but one applicable in its solution to the work- 
man at the bench. Racial characteristics affect quality and 
quantity of work. They affect adaptability, willingness to 
work, attitude toward work, and misconduct—the four basic
criteria of discharge. 

In short, an industrial psychology has still to be written, 
although the need for it has long been great. 

In connection with the Continuation or Part-Time School, 
psychology, and especially educational psychology, can render 
a third service to industry. Such a school, inaugurated some 
18 years ago in Chicago, is now permissive in Massachusetts, 
Ohio, Michigan and several other States, and compulsory 
in Wisconsin, Pennsylvania and New York. It provides for 
the continued schooling, on the employer’s time, either in the 
arts and sciences, in civics, or in vocational work, from 4 to 
8 hours a week, of all youth in industry who have no
education beyond the grade school. 

If this Law is not to operate as a handicap upon the employment
of such workers, it must justify itself by the increase
of efficiency through a sound educational program. This
efficiency will be measured by the employer—whether we like 
to face this fact or not—in terms of dollars and cents. Can, 
then, the psychologist offer a suggestion as to the best method 
of instruction, the best time of day within the factory schedule, 
the wise allocation of hours during the week? Can he advise 
as to whether such instruction should be by racial groups, or 
to a mixed group of all races? Can he hope to show the 
employer a relation between broad education and production 
units?

The fourth and perhaps most important service renderable 
to industry by the psychologist, concerns the foreman. The 
War Labor Board claims that 98% of industrial disputes faced 
during the War, led back to the foreman. The foreman is the 
Army Top Sergeant. The passed buck reaches him. He 
must handle men mentally and physically. Until recently, the 
emphasis was upon the physical. Under newer ideas and 
ideals of persuasion, therefore, industry now finds itself saddled
with men, whom it has taught to be “hard-boiled.” These
petty officers, either in industry or War, can make or break 
morale. Foremen, therefore, and prospective foremen, must 
be selected and trained for their jobs. The attitude between 
the foreman and the man at the bench, depends in part, upon 
the power the foreman holds. If he can hire and fire at will, 
methods of persuasion are less likely to be used. Two modern 
industrial tendencies are helping in the solution of this problem: 

First, The centralization of employment, whereby the hiring 
and firing is taken out of the foreman’s hands; and 

Second, The vestibule-school idea wherein the foreman be- 
comes a teacher to the man at the bench. 

These are both steps in the right direction. While the 
major task of the true foreman is that of problem-solving, 
he will always remain a manipulator of human material, and 
the product which leaves his room will continue to reflect his 
success or failure in dealing with psychological factors. At 
present, the average foreman is rarely a teacher, but he is a 
man, usually, who can DO. Tests devised by the psychologist 
to aid in the selection of foremen should, then, have in mind 
three qualifications: 

First, Ability to solve problems. 

Second, Ability to handle men. 

Third, Ability to teach the theory of that which one knows 
in practice. 

Industry does not want a booklet of psychological tests. 
The advertised success of the latter in the Army Camps has 
not sold the idea to the manufacturer. He must be shown the
value of any technique by patient education and proved results. 
The psychologist must not become “cocky.” He is already 
accused of it. His proposed solutions must check with prac- 
tical experience. He must know concrete conditions; must 
devise his remedial laboratory experiments in the light of them, 
and not sui generis. No time ever needed theories less than 
ours, and no time ever needed facts more. 

If we are to remove the dust from Industrialism and give 
the day’s work its ancient sparkle, the modern tendency to- 
ward de-personalization and mechanization must be checked. 
When this tendency succeeds, incentive fails—the adventure 
of creative work departs—business captures industry for 
profit—ideas become institutionalized, and the workman a 
machine. 

More than tests, analyses, teaching of aliens, or training of 
foremen, the psychologist is helping industry when, knowing 
its problems, he brings to their solution a new vision of the 
importance of psychology itself. 























































A CONSTANT ERROR IN PSYCHOLOGICAL 
RATINGS 


By Edward L. Thorndike, Teachers College, Columbia University


In a study made in 1915 of employees of two large indus- 
trial corporations, it appeared that the estimates of the same 
man in a number of different traits such as intelligence, indus- 
try, technical skill, reliability, etc., etc., were very highly 
correlated and very evenly correlated. It consequently ap- 
peared probable that those giving the ratings were unable to 
analyze out these different aspects of the person’s nature 
and achievement and rate each in independence of the others. 
Their ratings were apparently affected by a marked tendency 
to think of the person in general as rather good or rather 
inferior and to color the judgments of the qualities by this 
general feeling. This same constant error toward suffusing 
ratings of special features with a halo belonging to the indi- 
vidual as a whole appeared in the ratings of officers made by 
their superiors in the army. 

The official rating plan devised by Walter Dill Scott called 
for separate ratings for Physical Qualities, Intelligence, Lead- 
ership and Personal Qualities (i. e. Character). The instruc- 
tions very emphatically required each of these four to be esti- 
mated independently of the others, as appears from the direc- 
tions quoted below. Yet the correlations of the Intelligence 
rating with the ratings for Physique, Leadership and Character 
made by a very conscientious officer in the case of 137 avia- 
tion cadets whose work he, as flight commander, supervised, 
were .51, .58 and .64 respectively. These are all higher than 
reality, plus the attenuation due to erroneous judgments, could 
well give, especially within the restricted range of the com- 
missioned-officer group. They are also too much alike. In 
reality Intelligence and Character or Intelligence and Leader- 
ship should give about three times as close a correlation as 
Intelligence and Physique. 


“How to Make the Scale.

3. Make a list of about a dozen officers of your own
rank and not above the average age of officers of this
rank. They should be men with whom you have served
or with whom you are well acquainted. Include officers 
whose qualifications are poor or mediocre as well as those 
who are highly efficient. This list serves merely as a con- 
venient reservoir of names; the names actually used in 
the scale may include others. 

4. Look over your list from the viewpoint of Physical 
Qualities only. Disregard every characteristic of each 
officer except the way in which he impresses his men by 
his physique, bearing, neatness, voice, energy and endur- 
ance. Select that officer who surpasses all the others 
in this qualification and enter his name on the line marked 
highest under Physical Qualities. Now select the one who 
most conspicuously lacks these qualities and enter his
name on the line marked lowest. Select the officer who 
seems about half way between the two previously selected 
and who represents about the general average in physical 
qualities; enter his name on the line marked middle. 
Select the officer who is half way between the middle and 
the highest; enter his name on the line marked high. 
Select the one who ranks half way between middle and 
lowest; enter his name on the line marked low. 

5. In the same manner make out scales for each of the 
other four sections (Intelligence, Leadership, Personal 
Qualities and General Value to the Service). 

How to Use the Scale. 

6. Rate your subordinate for Physical Qualities first. 
Consider how he impresses his men by his physique, bear- 
ing, neatness, voice, energy and endurance. Compare him 
with each of the five officers in section I of the Rating 
Scale, and give him the number of points following the 
name of the officer he most nearly equals. If he falls 
between two officers in the Scale give him a number ac- 
cordingly (e. g. if between Low and Middle give him 7, 
7½ or 8).

7. Rate the subordinate in a corresponding manner for 
each of the other four essential qualifications. Under III 
(Leadership) and V (General Value to the Service) con- 
sider which officer he will most nearly equal after equiva- 
lent experience. 

* * * * * * * * * 
Points for Special Attention. 

9. In making or using any section of the scale, con- 
sider only the qualification it covers, totally disregarding 
all the others. 





























I. Physical Qualities.

Physique, bearing, neatness, voice, energy and endur- 
ance. (Consider how he impresses his men in the above 
respects.)

II. Intelligence. 

Accuracy, ease in learning, ability to grasp quickly the 
point of view of commanding officer, to issue clear and 
intelligent orders, to estimate a new situation, and to 
arrive at a sensible decision in a crisis. 

III. Leadership. 

Initiative, force, self reliance, decisiveness, tact, ability 
to inspire men and to command their obedience, loyalty 
and co-operation. 

IV. Personal Qualities. 

Industry, dependability, loyalty, readiness to shoulder 
responsibility for his own acts, freedom from conceit and 
selfishness, readiness and ability to co-operate. 

V. General Value to the Service. 
His professional knowledge, skill and experience; success
as an administrator and instructor; ability to get
results.”


The same effect appears in the ratings given by other offi- 
cers. The correlations are too high and too even. For 
example, for the three raters next studied the average corre- 
lation for physique with intelligence is .31; for physique with 
leadership, .39; and for physique with character, .28. 

The same constant error appears in the correlation of the 
total Scott rating with a rating for technical ability as a flyer 
in the case of aviation officers. It is known from abundant 
evidence that technical ability as a flyer is a rather highly 
specialized quality.¹ Considering the restricted range of the
aviation cadets, the correlation between general ability for 
officer work and technical ability as a flyer could hardly be 
above .40, without any attenuation. As attenuated by the 
imperfections of the rater’s knowledge of both, it could hardly 
be above .25. Yet the correlations for the eight raters studied 
in this respect are .74, .85, .52, .91, .63, .72, .47 and .53, an
average of .67. Obviously a halo of general merit is extended 
to influence the rating for the special ability, or vice versa. 
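
The comparison made in the preceding paragraph can be laid out in a few lines. The sketch below merely restates the eight coefficients and the two ceilings given in the text and shows how far each rater exceeds them. The variable names are illustrative and no new data are introduced.

```python
from statistics import mean

# The eight raters' correlations between total Scott rating and flying ability.
observed = [0.74, 0.85, 0.52, 0.91, 0.63, 0.72, 0.47, 0.53]

ceiling_true = 0.40        # upper bound argued in the text, without attenuation
ceiling_attenuated = 0.25  # upper bound once errors of rating are allowed for

average = mean(observed)
excess = [round(r - ceiling_attenuated, 2) for r in observed]

print(round(average, 2))                         # the average reported in the text
print(all(r > ceiling_true for r in observed))   # does every rater exceed the ceiling?
print(excess)                                    # amount attributable to the halo, per rater
```

Even the least affected rater stands well above what the facts plus attenuation could give, which is the ground for treating the excess as a constant error and not as chance.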

Mr. Knight of Teachers College has studied this same
effect in the case of 129 teachers rated by their superior officer
for certain qualities on the Boyce score card. The ratings in
question were official and were used to determine salaries and
promotions. General merit as a teacher has correlations of
.68 with intellect, .79 with power in discipline, and .63 with
voice. It is clear that the rating of a teacher’s voice must
have been influenced by the general impression of her ability.
Voice correlates .50 with “Interest in Community Affairs,”
and .63 with intelligence!

¹ See “The Selection of Military Aviators: Mental and Moral
Qualities,” U. S. Air Service, June 1919.

The correlations reported in the original study by Boyce 
show this same effect. General merit as a teacher is reported 
to correlate as follows: 


[trait illegible] .................................. .47
[trait illegible] .................................. .56
[trait illegible] .................................. .53
[trait illegible] .................................. .62
Initiative and self reliance ....................... .77
Adaptability and resourcefulness ................... .80
[trait illegible] .................................. .74
[trait illegible] .................................. .69
[trait illegible] .................................. [value illegible]
[trait illegible] .................................. [value illegible]
[trait illegible] .................................. .66
[trait illegible] .................................. .66
[trait illegible] .................................. .69
[trait illegible] .................................. .61
[trait illegible] .................................. .41
[trait illegible] .................................. .38

etc. etc. 


(The last is the lowest of the forty-five correlations reported.) 


In the cases so far the correlations are a resultant of (1) 
the real facts, (2) the constant error of the “halo,” as we 
may call it, and (3) the reverse error of attenuation due to 
chance inaccuracies in the ratings. In certain further work 
by Mr. Knight the correlations are freed from the last influ- 
ence, by being based on the composite rating of two groups, 
each of a number of teachers who knew the individuals to 
be rated fairly well. The self-correlations of the ratings by 
one such a group with ratings for the same trait by the other 
group are over .90. The correlations for general ability as 
a teacher with intellect and with ability to discipline are about 
.95 and .80! The correlation of intelligence and ability to dis- 
cipline is about .80! The correlations of a standard test of 
intelligence with general ability as a teacher and with ability 
to discipline are, for the individuals in question, not over .3. 
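
The part played by attenuation in such figures can be illustrated with Spearman's standard correction for attenuation, which divides an observed correlation by the square root of the product of the two reliabilities. The sketch below is an illustration only, not a computation reported in the text: the function names are hypothetical, and the reliability of .90 is an assumed value taken from the statement that the self-correlations are "over .90."

```python
import math


def disattenuate(r_observed, reliability_x, reliability_y):
    """Estimate the correlation freed from chance error in both measures.

    Spearman's correction: r_true = r_observed / sqrt(r_xx * r_yy).
    """
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / math.sqrt(reliability_x * reliability_y)


def attenuate(r_true, reliability_x, reliability_y):
    """Inverse of the correction: the correlation expected after chance error."""
    return r_true * math.sqrt(reliability_x * reliability_y)


# Assumed figures: observed correlation of .80 between rated intelligence and
# rated ability to discipline, with both composite ratings reliable to .90.
corrected = disattenuate(0.80, 0.90, 0.90)
print(round(corrected, 3))
```

Under these assumed reliabilities the correction moves an observed .80 only a little, which is consistent with the argument that chance inaccuracy does not explain the gap between the rating correlations and the test correlations of about .3.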

The writer has become convinced that even a very capable 
foreman, employer, teacher, or department head is unable to 
treat an individual as a compound of separate qualities and
to assign a magnitude to each of these in independence of
the others. The magnitude of the constant error of the halo, 
as we have called it, also seems surprisingly large, though we 
lack objective criteria by which to determine its exact size. As 
a consequence science seems to demand that, in all work on 
ratings for qualities the observer should report the evidence, 
not a rating, and the rating should be given on the evidence to 
each quality separately without knowledge of the evidence 
concerning any other quality in the same individual. 




















PSYCHOLOGICAL TESTS AS DIAGNOSTIC OF VOCATIONAL
APTITUDES IN COLLEGE WOMEN


By Elsie Murray, Sweet Briar College, Virginia


The use of tests in the diagnosis of the academic difficulties 
of college freshmen, or in the sifting of the qualifications of 
applicants for admission, is now sufficiently familiar. Their 
application to the differential study of members of the gradu- 
ating classes and of their fitness for various fields of activity 
is less widespread. Under the stress of war-time conditions, 
and the resultant necessity of recommending women graduates 
to a wide variety of positions, the following attempt to extend 
and adapt the technique of psychological testing to the latter 
problem was made by the writer. 

Tests Utilized. Four Series. A series of tests already 
under way in the junior and senior classes of a woman’s col- 
lege (numbering 36 and 32 respectively), in illustration prin- 
cipally of certain routine topics in the courses in general, 
experimental, social, and educational psychology, was further 
extended. The tests were then divided into four sets accord- 
ing as they were judged roughly from past experience to be 
suitable measures of the ability to handle ideas, symbols, things 
or people (captions borrowed with certain alterations from 
Thorndike). Each set, after considerable shifting, contained 
(with the exception of the fourth) ten distinct tests, repre- 
senting roughly two hours actual classroom time each, dis- 
tributed over a period of about a year and a half. The tests 
utilized were for the most part standard ones, supplemented 
by a few original with the author. The group method was 
usually employed, though a few were individually admin- 
istered. The first series contained tests in controlled asso- 
ciation, logical memory, completion, proverb matching, gen- 
eralisation, information, etc.; and will be referred to as the 
Reasoning or R series. The second was made up of rote 
memory, rote learning, addition, spelling, cancellation, easy 
directions, and related tests, and will be known as the Accuracy 
or A series. The third, or “Practical Ability” or P series,
contained tests of motor dexterity and alertness to surroundings,
such as tapping, card sorting, aussage, construction puzzle,
weight suggestion, divided attention, etc. The fourth, or
“Social Ability” set, comprises a series involving judgments
upon personal attributes, situations, and facial expression of
the emotions, referred to as S series.

Evaluation of the Scores. Pooling. The results were evalu- 
ated as follows: Each test was scored on a point system 
(empirically determined and somewhat approximating a per- 
centile ranking) for both speed and accuracy, the maximal 
credits assigned to any test being usually ten points. The 
scores within each set were then pooled, and each student 
assigned a rank (1 to 36 or 1 to 32) on the basis of each of 
the four combined scores (R, A, P and S). While a small 
number rank high on all four ratings, this procedure was suc- 
cessful in differentiating each class roughly into four groups: 
one, of those doing better in R than in any other series; 
another, in A; another in P, etc. The correlations¹ between
the scores for the four test series range from +.05 for A and S
to +.68 for R and S.

$$\rho = 1 - \frac{6\sum D^{2}}{n(n^{2}-1)}$$
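
The Spearman rank-differences coefficient used for these correlations can be sketched as follows. This is a minimal illustration assuming untied ranks running from 1 to n, as in the class rankings described; the function name and the five-student example are hypothetical and are not data from the study.

```python
def rank_difference_correlation(ranks_x, ranks_y):
    """Spearman rank-differences coefficient: 1 - 6*sum(D^2) / (n*(n^2 - 1)).

    Both arguments are lists of ranks (1..n, no ties) for the same n
    individuals, listed in the same order.
    """
    n = len(ranks_x)
    if n != len(ranks_y) or n < 2:
        raise ValueError("need two rank lists of equal length, n >= 2")
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))


# Hypothetical ranks of five students on two pooled test series.
series_r = [1, 2, 3, 4, 5]
series_s = [2, 1, 4, 3, 5]
print(rank_difference_correlation(series_r, series_s))
```

With ranks rather than raw scores as input, the coefficient depends only on the order in which students fall, which suits pooled point scores whose units are arbitrary.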

Interpretation of the Resultant Ratings. The significance 
of the differentiation of individuals thus secured was meas- 
ured roughly by comparison of test ratings with classmates’, 
instructor’s, and self estimates of I (intelligence), A, P and 
S abilities, and with academic grades. The classmates’ esti- 
mates were obtained by having each subject arrange the names 
of all the members of her class in order (1 to 36), according 
to the degree of the ability in question which she believed each 
to possess. The median position of each individual for each 
trait was then calculated, and a composite ranking representing 
the group judgment obtained for general intelligence, accuracy, 
practical and social ability, respectively, for both classes; an 
additional estimate on teaching ability for 1918. This method 
(of relative position) was found to yield higher reliability 
coefficients (ranging from .82 for S to .97 for I) than were 
secured by the seven or nine grade system. 
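
The method of relative position described above can be sketched in a few lines. This is an illustration under stated assumptions: each judge supplies a complete ordering of the class, the composite rank follows the median position, and ties in the median are broken alphabetically (a choice made here for determinism, not one stated in the text). The names and orderings are hypothetical.

```python
from statistics import median


def composite_ranking(orderings):
    """Combine several judges' orderings into one group ranking.

    orderings: list of lists, each a judge's ordering of the same names,
    best first. Returns (median_positions, ranked_names).
    """
    positions = {}
    for ordering in orderings:
        for place, name in enumerate(ordering, start=1):
            positions.setdefault(name, []).append(place)
    median_positions = {name: median(p) for name, p in positions.items()}
    # Assumption: equal medians are ordered alphabetically.
    ranked = sorted(median_positions, key=lambda name: (median_positions[name], name))
    return median_positions, ranked


# Three hypothetical classmates each order the same three students.
judges = [
    ["Ann", "Bea", "Cat"],
    ["Bea", "Ann", "Cat"],
    ["Ann", "Cat", "Bea"],
]
medians, ranked = composite_ranking(judges)
print(medians, ranked)
```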

1. The correlations of test rankings with the corresponding
estimates were as follows, for the R, A, P and S abilities: for
1918, .63, .30, .12, and .19 respectively; for 1919 (with improved
methods of giving and scoring) .73, .36, .53, .26. The
corresponding correlations for tests and instructor’s estimates
(the writer’s was the only one available)² are, for 1918, .84,
.79, .50, .44; for 1919, .83, .71, .59, and .65. While the coefficient
for reasoning tests and intelligence estimates (both
classmates’ and instructor’s) is high (averaging .68 and .835
for the two classes),³ as compared with similar measures
quoted by others (see Hollingworth, for classmates, .62; Kitson,
Ruml, Thurstone, Abelson, and Webb, for instructors,
average .57, .66, .60, .58, and .545, respectively), the correlations
for the A, P and S abilities are less striking. The
latter are, however, for the most part higher with the improved
methods of 1919; higher also for instructor’s than for classmates’
estimates.

¹ All correlations were figured by the Spearman ‘Rank-Differences’
formula.

The real significance of these figures (slight, it must be 
admitted, in certain cases), the possible weaknesses of both 
tests and estimates, as well as certain differences between 
faculty and student estimates, are best brought out by the 
complete tables of correlations and cross-correlations. Cer- 
tain inherent shortcomings of either P tests or estimates, for 
instance, would seem to be indicated by the fact that the P 
estimates of classmates and instructor correlate as well or 
somewhat better with A than with P test scores. On the 
other hand, the closer correlation of classmates’ (not instruc- 
tor’s) A estimates with P than with A test scores is probably 
to be attributed merely to a slip in the instructions forming 
the basis of the A estimates in 1919, in which manual rather 
than clerical accuracy was emphasized. Finally, the higher 
correlation of classmates’ estimates of S ability with R than 
with S test scores challenges explanation: while undoubtedly
due in part to an inadequate differentiation of the R and S
test series, it is in part also to be attributed to the difficulty 
of securing reliable estimates of the quality aimed at, the 
students tending in many cases to substitute a judgment based 
upon superficial amiability of manner, for the actual esti- 
mate of tact and sympathetic understanding of human nature 
which was sought. 


² The correlation coefficients between instructor’s and classmates’
estimates for I, A, P and S average .77, .61, .28, and .35 for 1918 and


³ The exact degree of correlation involved may be clarified by the
following statement: the coefficients of .63 and .73 for class I estimates 
and R tests mean respectively that, in 1918 and 1919, 72 and 81% of 
those scoring above the median in the tests, ranked also above the 
median in the class estimates. 










CORRELATION TABLES FOR CLASS OF 1919*


          CLASSMATES’ ESTIMATES        INSTRUCTOR’S ESTIMATES
Tests      I      A      P      S       I      A      P      S      T
R         .73    .19    .52    .42     .83    .51    .49    .57    .80
A         .45    .36    .56    .05     .42    .71    .74    .04
P         [?]    .43    .53    [?]     .55    .58    .59    .38
S         [?]   -.2[?]  [?]    .26     .66    .18    .26    .65


*With the one exception indicated, these are all positive.
[?] marks an entry illegible in the source.


2. The correlation of test ratings and academic standing
was next attempted, bringing to light an apparent variation
in the reliability of grades from year to year, or class to class;
further, the relative unreliability of freshman as compared
with upper class marks. The correlation, for instance, of
academic standing with R tests drops from .71 in 1918 to .46
in 1919; with instructor’s estimate of intelligence, from .80
to .68; with classmates’ from .86 to .58; with instructor’s
estimate of teaching ability, from .81 to .55 (and this at the
same time that the coefficients for classmates’ estimates of
intelligence and R tests rise from .63 in 1918 to .73 in 1919;
for instructor’s estimate of teaching ability with R tests from
.70 to .80, with combined R A P S scores from .75 to .82).

This collapse of the correlation coefficients for academic 
standing in 1919 was traced to two sources. First, to the fact 
that for this class freshman and sophomore grades only (in 
required courses) were utilized, while in 1918 junior grades 
had been included; elimination of the latter in calculating 
1918’s standing reduces the correlation with R tests from 
.71 to .60, with intelligence estimates (classmates’), from
.86 to .77. More particularly, the drop in the 1919 coefficients 
appears to be correlated with the presence in the class of 
two members of considerable natural ability, but of indif- 
ferent academic ambitions, especially in the earlier years of 
their college course; elimination of whose scores raises all 
of the 1919 grade correlations. In the opinion of the writer, 
the facts warrant the assumption of the superior value, year 
in and year out, of R test ratings over academic grades, as 
measures at least of potential ability. 

Corroboration of our hypothesis as to the superior reliability 
of grades from the later college years is apparently furnished 
by the corresponding correlations (of R tests and grades) 
reported by other workers; e. g., .20 for freshmen, .44 for a
mixed group of freshmen, juniors, and seniors (Kitson); .42
for juniors, .57 for seniors (Hollingworth); note the ascending
series. The possible explanation, that mechanical memory
work is the determining factor in freshman and sophomore 
grades, as against initiative and reasoning ability in the upper 
classes, is not, however, supported by the relatively low cor- 
relation of grades and A test scores for 1919. Freshman 
instability and irresponsibility, or variations in the standards 
and personnel of the instructing force are presumably the 
principal sources of the discrepancies and inconstancies noted. 

Scrutiny of the table of correlations for test scores and
grades in particular subjects brings to light other facts of
interest. For 1919 (the test scores for which are regarded
as more reliable), the study correlating most closely with R
test scores is English (+.74); this subject, moreover, gives
decidedly lower correlations (+.31 to .36) with all other
test sets. The study yielding the highest correlation with A
test scores is Mathematics; the figure, to be sure, is low
(+.34) and is followed closely by the Latin and English coefficients.
However, the correlation of Mathematics with all
other test groups is lower (+.02 to -.19). While the subject
correlating most nearly with the P tests is English
(+.34), Science is a close second (+.31, P. E. = .105); moreover,
the correlation of Science with all the other test sets is
lower (+.26 to -.01). Lastly, the subject correlating the
most closely with S scores is History (+.64), and the correlations
of the three other sets of scores with this subject are
all lower (.57 to .10).


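CORRELATION OF TEST SCORES AND COLLEGE SCORES

Tests       Latin   Mathematics   Science   English   History   Modern Language

R   1918     .67       .29          .39       .70       .60
    1919     .31       .01          .21       .74       .57          .37
A   1918     .59       .39          .57       [?]       .41
    1919     .32       .34          .26       .31       .2[?]        .20
P   1918     .10       [?]          .24       .27       .26
    1919     .21       .02          .31       .34       .10          .10
S   1918     .45       [?]          .18       .57       .47
    1919     [?]      -.19         -.01       .36       .64          .30

[?] marks an entry illegible in the source. Only five entries survive in
each 1918 row; they are given in source order, and their column positions
are uncertain.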


Whether or not these correlations will be found valid for
other colleges or classes remains to be proven. The relationships
anticipated had been, it may be noted, between Latin
and R tests, Mathematics and A tests, Science and P tests,
and either English or History and S tests. The significance
of the (actual) English and R test correlation is ambiguous.
Does it indicate a combination of R tests too closely dependent
upon verbal facility and vocabulary? the superior
value of achievement in freshman English as a criterion of
general ability? or merely the superior reliability of the English
instructors’ judgments? The virtually zero correlation
of Mathematics grades (for 1919) with all but A tests gives
rise to further speculation. Is it possible that success in college
Mathematics presupposes, at least in women, a specialized
talent existing, as often as not,⁶ at the expense of or in inverse
ratio to humanistic interests and practical abilities?

⁴ The complete correlation table for test scores and grades is as follows:
for 1918, for R, A, P, S and total test scores, .71, .64, .31, .48,
and .75; for 1919, .46, .33, .26, .24, and .45.

3. Finally, each student was asked to register the line of 
work for which she believed herself to be best adapted. In 
1918, when an option of three lines of work only was offered, 
in a class of thirty-six, 11 registered for teaching, 13 for 
clerical or secretarial work, 12 for practical. Of the 13 
registering for clerical work, 8 (61.5%) score in the upper 
third of the A test ranking; of the 11 assigned to teaching, 
6 (54.5%) appear in the upper third of the R test ratings;
while of the 12 signing for practical work, only 3 (25%) 
are found in the upper third of the P test ranking, although 
48% score higher in this than in any other test rating. In 
1919, of those mentioning clerical work, 62.5% rank in the 
upper third of the A tests; of those mentioning teaching, 50% 
in the upper third of the R tests; of those mentioning social 
work, 46% in the upper third of the S test scores; of those 
signing for practical work, 37.5% in the upper third of the 
P test rating, while 43.5% do better in P tests than in any 
other set. 
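
The upper-third figures quoted above are simple proportions, and the arithmetic can be checked with a short sketch. The function name and the list of individual ranks are hypothetical; only the counts (8 of 13 in a class of 36) come from the text, and the upper third is assumed to mean ranks 1 through 12.

```python
def percent_in_upper_third(ranks_of_choosers, class_size):
    """Percentage of students choosing a line of work whose rank on the
    matching test series falls in the top third of the class."""
    cutoff = class_size // 3  # assumption: top third = ranks 1..class_size//3
    hits = sum(1 for rank in ranks_of_choosers if rank <= cutoff)
    return 100 * hits / len(ranks_of_choosers)


# Hypothetical A-test ranks for the 13 students choosing clerical work in
# 1918, arranged so that 8 fall within ranks 1 to 12, as the text reports.
clerical_ranks = [1, 2, 3, 5, 8, 10, 11, 12, 15, 20, 25, 30, 36]
print(round(percent_in_upper_third(clerical_ranks, 36), 1))
```

If choices were unrelated to test standing, roughly a third of each group would be expected in the upper third, which gives a rough yardstick for the percentages reported.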

⁵ Against this construction may be cited the high correlations of English
grades with instructor’s I estimates (.71 and .64 for 1918 and
1919); and with classmates’ (.70 and .77).

⁶ In this connection, it is interesting to note that the correlation between
freshman Mathematics and English for 1918 is +.555, for 1919,
only +.08. It is possible, of course, that in larger, more representative
groups this relationship would exhibit more constancy. The corresponding
figure quoted by Tolman (Journal of Educational Psychology,
February, 1919) for 27 freshmen is .47 ± .10; by Uhl, for 100, .48,
ibid., Jan. 1919; by Bell, ibid., Sept. 1916, for 750, .46.

Conclusions. From the point of view of vocational guidance,
the present study can be regarded as tentative only;
both the experimental procedure and the securing of estimates
demand modification, and checking up by post-graduate study
of the individual. At the same time, the probable existence
of distinct aptitudes which fit or unfit the individual for one
of the four lines of work mentioned seems to be fairly demonstrated
by the degrees of difference manifest in the four test
ratings. Whether these aptitudes are innate or acquired,
whether they depend upon specialised abilities (unit traits),
upon interests derived from a variety of factors, or are
merely the by-products of varying degrees of general ability,
is not our immediate concern. Careful weighing of the data
in hand, however, inclines the writer to the view that degrees
of intelligence, seconded by certain physical and temperamental
peculiarities, play an important role. While certain gifted
individuals may display ability along all four lines, only those
lacking in ideas and originality are likely to satisfy themselves
for any period with purely clerical activities; few possessing
them are likely to content themselves with, or give
undivided attention to, work largely manual or practical. The
actual bar to clerical or practical activities, however, may be
primarily defective eyes, a nervous temperament, or natural
clumsiness of hand. It is, moreover, an interesting speculation
as to whether variations in instinctive endowment, excess
or deficiency in the social and sympathetic instincts, in curiosity
(as manifest in alertness of the senses, and a natural turn
for manipulation), in facile sensory motor learning, may not
play a part in determining inclination toward or away from
social, practical, or intellectual work; or whether a differentiation
in the degree of development of the motor centres for the
larger and the smaller muscles may not lie at the root of a
preference for or an aversion to practical or clerical work.

Pending more definite knowledge along these lines, the use 
of class estimates and of tests of the variety here described 
will be found invaluable in stimulating the student to more 
careful weighing of alternatives and qualifications in the choice 
of a vocation; in accentuating the need of more adequate 
preparation; and, incidentally, in demonstrating the value of 
a wider range of choice in electives, to counteract the tendency 
toward too narrow specialisation so often revealed. 

The time and labor thus expended may be further justified 
by the amount of light thrown upon undergraduate problems 
of a more purely academic nature by the test data. General 
tendencies, such as too complete a reliance upon verbal mem- 
ory, defective habits of concentration or self-control, lack 
of expressive facility, of abstract or general ideas or of the 
ability to form spontaneous generalisations, habits of care- 
lessness, the tendency to sacrifice accuracy to speed, an unde- 
veloped social consciousness, inefficiency in handling one’s 
ordinary working materials, may all be isolated and magnified 

















es 











a ae eee 





Lr 


Pah ys) SN le at 


—— 











PSYCHOLOGICAL TEST’S IN COLLEGE WOMEN 37 


by the range of tests here advocated, and the student advised 
accordingly. If the individual is looking forward to teaching, 
or to clerical or secretarial or social work, the professional 
motive may be utilized to stimulate effort along lines where 
weakness in the necessary elementary processes or qualifica- 
tions can be demonstrated. 

In the case of students involved in academic difficulties, the 
R tests may disclose a coefficient of intelligence so low as to 
render a satisfactory solution of their college problems unlikely;
at the same time their A, P or S scores may indicate
a practical (or other) ability sufficient to warrant the attempt 
to divert their ambitions from the conventional teaching career 
to lines of work more suited to their actual abilities. On the 
other hand, the discovery of a high intelligence coefficient in 
the unambitious may be utilized to induce the adoption of a 
more serious attitude toward college opportunities. 


Summary 


1. Scores obtained from a series of tests distributed over 
a number of months in the junior and senior years, when 
pooled in four sets to measure General Intelligence (or Teach- 
ing Ability), Accuracy (or Clerical Ability), Practical and 
Social Ability, respectively, afford evidence of decided dissimilarity
in the four functions measured.

2. Classmates’ estimates of each other’s ability furnish addi- 
tional information of vocational interest, and to a certain 
extent corroborate the findings of the tests. 

3. College grades based upon freshman and sophomore 
courses appear on the whole to be less reliable than those of 
later years⁷ as indicators of real ability and differentiation
of talents. 

4. Comparison of the student’s own vocational choice with 
the various test scores reveals a fairly high degree of corre- 
spondence between individual ambition and experimental 
findings. 

5. The test data accumulated are applicable to a variety of 
academic problems. 


⁷ This point is developed further, with reference to the major subjects
pursued by the individual, in a fuller account of this investigation
to be published later.


BIBLIOGRAPHY


1. Bell, J. C.: Mental Tests and College Freshmen, Jour. Educ.
Psych., VII, 1916.
2. Bingham, W. V.: Some Norms of Dartmouth Freshmen, Jour.
Educ. Psych., VII, 1916.
3. Bingham, W. V.: Mentality Testing of College Students, Jour.
Applied Psych., I, 1917.
4. Brown, W.: Some Experimental Results in the Correlation of
Mental Abilities, Brit. Jour. Psych.,
5. Cattell, J. McK.: Homo Scientificus Americanus, Science, XVII,
6. Cattell, J. McK., and Farrand: Physical and Mental Measurements
of Students, Psych. Rev., III, 1896.
7. Downey, J.: Form and Position in Handwriting, Jour. Educ.
[illegible], 1915.
8. [name illegible], J. K.: Statistical Study of Character, Ped. Sem., XXIV,
9. Healy, W., and Fernald, C.: Test for Practical Mental Classification,
Psych. Rev. Mon., XIII, No. 54, 1911.
10. Hollingworth, H. L.: Vocational Psychology, 1916.
11. Kitson, H. D.: Scientific Study of the College Student, Psych. Rev.
Mon., XXIII, 1, 1917.
12. Link, H. C.: An Experiment in Employment Psychology, Psych.
Rev., XXV, 1918.
13. McCrory, J., and [name illegible], I.: Freshman Tests at the University of
Iowa, Jour. Educ. Psych., IX, 1918.
14. Norsworthy, N.: On the Validity of Judgments of Character,
Essays in Honor of William James, 1908.
15. Pintner, R.: Standardization of Knox Cube Test, Psych. Rev.,
XXII, 1915.
16. [name illegible], E., and Lowden, G.: Psychological Tests at Reed College,
Jour. [illegible] Psych., 1916.
17. Ruml, B.: Reliability of Mental Tests in the Division of an Academic
Group, Psych. Rev. Mon., XXIV, 1917.
18. Sharp, S.: Individual Psychology, Amer. Jour. Psych., X, 1899.
19. Simpson, B. R.: Correlations of Mental Abilities, 1912.
20. Stenquist, Thorndike, and Trabue: Intellectual Status of Children
who are Public Charges, 1915.
21. Stern, W.: Differentielle Psychologie, 1911.
22. Terman, L. M.: Measurement of Intelligence, 1916.
23. Thorndike, E. L.: Introduction to the Theory of Mental and Social
Measurements, 1913.
24. Thorndike, E. L.: Educational Diagnosis, Science, 1913, p. 133-.
25. Thompson, H. B.: Mental Traits of Sex, 1903.
26. Washburn, M. F.: A Study of Freshmen, Amer. Jour. Psych., 1917.
27. Webb, E.: Character and Intelligence, 1915.
28. Weidensall, J.: The Mentality of Criminal Woman, 1916.
29. Whipple, G. M.: Manual of Mental and Physical Tests, 1915.
30. White, W. A.: Outlines of Psychiatry, 1912.
31. [name illegible], M. T.: An Empirical Study of Certain Tests for Individual
Differences, 1911.
32. Wissler, C.: Correlation of Mental and Physical Tests, Psych.
Rev. Mon., III, No. 16, 1901.
33. Woodworth, R. S., and Wells, F. L.: Association Tests, Psych.
Rev. Mon., XIII, 57, 1911.
34. Woolley, H. T., and Fischer, C. R.: Mental and Physical Measurements
of Working Children, Psych. Rev. Mon., XVIII, 77,
35. Yerkes, R. M., Bridges, J. W., and Hardwick, R. S.: A Point
Scale for Measuring Mental Ability, 1915.
































THE APPLICABILITY OF MENTAL TESTS TO
PERSONS OVER FIFTY YEARS OF AGE


By Josephine Curtis Foster,¹ Formerly Chief Psychologist, and
Grace A. Taylor,¹ Formerly Psychological Interne, Psychopathic
Hospital, Boston


In psychological examinations made at the Psychopathic 
Hospital it has for some time seemed evident to the examiners 
that the majority of patients over 50 years of age do especially 
poorly in the tests of memory and in certain allied tests. 
Casual observation and another study² showed further that
this deficiency depends relatively little upon type of mental 
disease and relatively much upon chronological age. The 
present paper is an account of our attempt to find reasons 
and numerical expression for the differences among our 
patients. We have attempted also to check our results by 
studying the influence of chronological age in the case of nor- 
mal persons. 


Cases Used and Their Comparability 


The following groups of cases have been studied. No
foreign-speaking adults and no children of foreign-speaking
parents are included in any group.

(a) 106 older³ patients in a large general hospital in Boston
(not an insane hospital).

76 per cent of this group are men. So far as our problem
is concerned, however, a study of the cases reveals no significant
influence of sex upon results.

¹ Our thanks are due to the following physicians whose cooperation
made possible the work on the normal old persons: Drs. John J.
Dowling, Edmund W. Wilson, John L. Ames, John Bapst Blake, W. E.
Faulkner, E. N. Libby, Edwin A. Locke, H. A. Lothrop, Paul Thorndike,
and Townsend Thorndike. We wish also to thank Dr. F. H.
Thomas for allowing us to work at the Foxboro State Hospital, and
Dr. R. M. Yerkes for allowing us to use the data on school children.

² To be published shortly by J. C. Foster under the title “Significant
Responses in Certain Memory Tests.”

³ “Older” in this paper is taken to mean 50 or more years of age.
“Younger” is to be understood similarly as less than 50 years of age.

This hospital charges up to $10 a week where payment 
is exacted, but admits patients free when their families are 
unable to pay for treatment. 59 per cent were manual laborers. 
The average social status of the patients of this group is, 
therefore, relatively low. 32 per cent of them, for instance, 
left school before the 5th grade, and 85 per cent of them 
before the 9th grade. 

All of these patients were given the Yerkes-Bridges Point 
Scale by the second writer. Our original plan was to include 
also several additional memory tests, but they had to be aban- 
doned. In the first place, many of the patients had too poor 
eyesight to be given any “visual verbal” tests. But more 
than this, many of the older persons objected to attempting 
any memory test. At the mere hearing of the directions 
“when I am through reading I want you to tell me as much 
as you can of what I read” many of this group made such 
remarks as “Oh, no! I couldn’t do that” or “Oh, no! my memory
is too poor for that” or “That is child’s play, don’t try
anything like that on me.” The true reasons for refusal evidently
were the conviction that their memory was poor and a
reluctance to display this weakness. In general we found in
working with normal older persons a similar reluctance to
undertake other “difficult” tests. Some half dozen cases, for
example, had to be discarded because the patient refused to
attempt any test after the twelfth; that is to say, when the
thirteenth was given, they would not attempt it, nor would
they attempt any succeeding test. We were therefore forced
to be content with giving the Point Scale alone.

(b) 315 normal young men between ages 20 and 30. This 
group is distinctly inferior in social status and education to 
the above group. 79 per cent were manual laborers. 65 per 
cent of them left school before reaching the 5th grade, and 
96 per cent left before reaching the 9th grade. All of these 
cases were given the Point Scale (omitting tests 14 and 18) 
by other examiners, but all scoring was checked by us. Tests 
14 and 18 were omitted because the majority of the group 
had too little schooling to be able to do any of the tests involv- 
ing sentence construction. 






















(c) 316 normal school children between ages 10 and 19. 
These cases were from the Cambridge public schools and were 
tested in obtaining the original norms for the Point Scale.⁴
Their average social status is at least as high as that of 
group (a). 

(d) 136 older persons, patients diagnosed psychotic either 
at the Psychopathic Hospital or at the Foxboro State Hos- 
pital. Their social status is comparable with the patients of 
group (a). It is possibly slightly superior. 60 per cent of 
the group are men. 27 per cent only were manual laborers. 32 
per cent left school before reaching the 5th grade and 75 
per cent before reaching the 9th. All were given the Point 
Scale by examiners of the Psychopathic Hospital. 

(e) 151 younger persons, patients diagnosed psychotic at 
the Psychopathic Hospital. 52 per cent of the group are 
men. 22 per cent only were manual laborers. 17 per cent left
school before reaching the 5th grade and 74 per cent left
before reaching the 9th. There is no reason to suppose that 
the social status of this group is different from that of the 
above group (d). All were given the Point Scale by exam- 
iners at the Psychopathic Hospital. 


Comparison of Older Psychotic and Younger Psychotic 
Persons (Groups d and e) 


If we temporarily disregard type of mental disease, we find 
that the younger group attains on the average a higher score 
on each test as well as a higher total score. The superior- 
ity of the younger patients is, to be sure, most marked in 
tests 13 (words in three minutes), 14 (three words in one 
sentence), 16 (drawings from memory), and 18 (dissected 
sentences), but we cannot be sure how much the results are
affected by type of disease and by total score (or mental age). 

If, in an attempt to eliminate the influence of total score, we 
fractionate our cases by total score groups, we find two such 
score-groups sufficiently large for use,—53 to 60 (mental age 
10) and 72 to 79 (mental age 13). The data for these groups 
are given in Table 1. 








⁴ See Yerkes, Bridges, and Hardwick, A Point Scale for Measuring
Mental Ability, 1915, Chap. 4.
















TABLE 1
AVERAGE SCORES OF OLDER AND YOUNGER PSYCHOTIC PERSONS

Total Score              53-60               72-79
Age Groups           Older  Younger      Older  Younger
Number of Cases        15      18           7      21

Test  1               3.0     3.0         3.0     3.0
      2               3.1     3.2         3.7     4.0
      3               3.0     2.9         2.8     3.0
      4               3.2     3.2         4.1     4.0
      5               2.8     3.9         4.0     3.9
      6               3.4     2.3         4.2     3.2
      7               6.6     8.0         8.5     8.6
      8                .8     1.1         1.5     1.9
      9               3.8     4.0         5.0     5.3
     10               4.2     4.5         6.4     6.1
     11               1.8     2.4         2.7     2.0
     12               3.1     3.6         4.0     3.8
     13               1.4     1.3         2.2     2.6
     14                .8      .9         2.2     3.1
     15               5.5     4.3         7.1     7.0
     16                .0      .7          .7     1.7
     17               2.2     1.4         3.2     3.9
     18                .2     1.3         3.1     4.1
     19               3.2     2.0         5.1     4.1
     20               1.4     2.[?]       2.8     2.7


From the results given in Table 1, we see that the seven 
tests in which there is a decided difference in score obtained 
by the two age-groups (older and younger) are 6, 8, 14, 15, 16, 
18, and 19. The younger attain in test 16 (drawings from 
memory) 343 per cent of the score of the older, in test 18 
(dissected sentences) 164 per cent, in test 8 (arrangement 
of weights) 130 per cent, and in test 14 (three words in one 
sentence), 133 per cent. In test 6 (repetition of sentences), 
on the contrary, the corresponding figure is but 72 per cent, 
in test 15 (comprehension of questions) 90 per cent, and in 
test 19 (definition of abstract terms) 74 per cent. These 
percentages must not be taken to mean too much because 
thus far we have paid no attention to character of disease 
but have grouped all psychotic cases together indiscriminately. 
We have, of course, an entirely different proportion of some 
mental diseases, such as senile dementia and arteriosclerosis, 
in the two groups. The effect of this factor, therefore, may 
be cutting across the effect of chronological age. 
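
The percentages quoted above are not accompanied by a statement of how they were computed. Several of them can be reproduced on the assumption, which is an inference and not a procedure stated in the text, that the average scores of the two total-score groups are added for each age-group and the younger sum is then divided by the older. The following Python sketch applies that assumed rule to tests 6, 8, and 15 of Table 1, reading the entries for tests 8 and 15 with their decimal points restored. The names TABLE_1 and younger_as_percent_of_older are illustrative only.

```python
# Rows of Table 1 as (older 53-60, younger 53-60, older 72-79, younger 72-79).
# Assumption: entries printed without a decimal point (tests 8 and 15)
# are read as tenths, e.g. "15" as 1.5 and "43" as 4.3.
TABLE_1 = {
    6: (3.4, 2.3, 4.2, 3.2),
    8: (0.8, 1.1, 1.5, 1.9),
    15: (5.5, 4.3, 7.1, 7.0),
}


def younger_as_percent_of_older(row):
    """Younger score as a percentage of older, summed over both score groups.

    The summing rule is an assumed reading of the text, not a stated method.
    """
    older_low, younger_low, older_high, younger_high = row
    older = older_low + older_high
    younger = younger_low + younger_high
    return 100.0 * younger / older


for test, row in TABLE_1.items():
    print(f"test {test}: {younger_as_percent_of_older(row):.0f} per cent")
```

Under this assumption the three rows give 72, 130, and 90 per cent, in agreement with the figures quoted for tests 6, 8, and 15.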

The effect of type of disease can be eliminated only by 
















considering each disease separately. Unfortunately we have 
only one diagnosis where our cases are sufficiently numerous 
to warrant such comparison. That disease is dementia prae- 
cox. We tried many groupings of total score and of chrono- 
logical age in these cases and arrived finally at seven sub- 
groups each of which had the same range of total score but 
different chronological ages and each of which contained at 
least 10 cases under each age-group. The number of cases 
in some of these age-groups was as great as 52. The sub- 
groups are: Score 46-71, age 10-19 and 20-29; Score 46-71, 
age 10-29 and 30-49; Score 46-71, age 10-39 and 40-69; Score 
72-85, age 10-29 and 30-49; Score 72-100, age 20-29 and 
30-39; Score 72-100, age 10-29 and 30-49; and Score 72-100,
age 10-39 and 40-69. Score 18-71 gives several groups 
but is too great a range for reliable comparisons. The 
comparison of the first twelve tests in the scale reveals nothing 
decisive, for now a younger, and now an older, group appears 
to be slightly superior. When, however, we come to the thir- 
teenth test (words in three minutes) we find that in 5 of the 
7 sub-groups the younger are uniformly better. The younger 
ages with entire uniformity give higher scores in tests 14 
(three words in one sentence), 16 (drawings from memory), 
and in 18 (dissected sentences). Their score is greatest in the 
16th test (about 200 per cent of the score of the older per- 
sons). The older ages, on the other hand, are superior in 
tests 15 (comprehension of questions), 17 (detection of ab- 
surdities) and 19 (definition of abstract terms). Their best 
performance is in test 17 (about 125 per cent). We have not 
given the complete data in these cases because it does not seem 
worth while in view of the relatively small number of cases 
in the groups and in view of the fact that with dementia 
praecox we may not be dealing with a unity after all. 

The general result is that judging from our psychotic cases 
alone and given equal general intelligence, patients over 50 
years of age excel those under 50 in comprehension of questions,
detection of absurdities, and definition of abstract terms;
while they are inferior to the younger patients in construction 
of sentences and drawings from memory. 


Comparison of Older Psychotic and Older Normal Persons 
(Groups a and d) 

In the preceding section we have been careful to insist that 
we have been dealing with results on psychotic patients and 
that it may not be justifiable to infer from them to normals. 
The reader may also be tempted to argue that the changes 





















in score which appear with advancing chronological age may 
be due, not to the influence of age itself, but to the influence 
of the length of time which a patient has suffered from 
mental disease. That is to say, it is to be expected that de- 
terioration will continue with the progress of a mental disease. 
We may have been measuring simply the influence of the 
course of the disease. If so, the psychotic older persons should 
differ markedly from normal persons of the same age. The 
next step, therefore, will be to compare older psychotic persons 
with their normal contemporaries. 

Our groups of psychotic and normal older persons (groups 
a and d) are of approximately the same social level, with the 
former holding the possible advantage. In other regards the 
two groups are also very similar. 

If we consider the groups as a whole and disregard both 
total score and type of disease, we find that the normal old 
have a higher average total score than the psychotic old, and 
that in most cases they have a higher average score on each of 
the tests. The exceptions are test 13 (words in three min- 
utes) where the averages are the same, and test 14 (three 
words in one sentence) where the older psychotic patients sur- 
pass the older normal persons by one tenth of a point. This 
result in part answers the question we raised above in regard 
to the deterioration of the older psychotic persons. The 
older psychotic persons do show a lower intelligence rating 
than their normal contemporaries. Since they also have, if 
anything, a higher social rating, we must conclude that this 
greater deterioration is due to the progress of mental disease. 
A question remains, however. Is their greater deterioration 
a quantitative difference simply, or have the psychotic persons 
fallen off in certain abilities while retaining others to a normal 
degree? And further, are our results true for all diseases 
or merely for their average? 

The last question may be answered first and most easily. 
Our cases include dementia praecox, unclassified paranoid 
psychosis, syphilitic psychoses, acute alcoholic and deteriorat- 
ing alcoholic psychoses, manic-depressive insanity and senile 
dementia. If we consider these diseases separately and com- 
pare the results for one with the average attainments of normal 
older persons, we find that the dementia praecox, senile de- 
mentia, and arterio-sclerotic persons differ most from normal 
persons of their age, while manic-depressive and unclassified 
paranoid cases differ least. There is no disease, however, 
which gives results contradictory to our earlier figures for 
the two groups as wholes. 
























We come now to the question of the particular tests in 
which these changes are most marked. To reduce the effect 
of total score, we must once more fractionate our cases 
accordingly. We then find two groups with sizes sufficiently 
great for reliable comparison: those with total scores between 
53 and 60, and those with total scores between 76 and 79. 
From the results thus arranged it appears that the low rating 
of the psychotic cases in some of the tests, at any rate (such 
as words in three minutes, three words in one sentence, and 
drawings from memory) is not due to the fact that the patients 
are psychotic, for normal persons of the same general age, and 
of the same mental rating, do even more poorly in these par- 
ticular tests than do psychotic persons. 

If, then, the particular failures are due not to the disease but 
to chronological age, we should get the same general results 
from the consideration of normal cases. 


Comparison of Older Normal and Younger Normal Persons 
(Groups a, b, and c) 


The results for our normal older persons are briefly sum- 
marized in Table 2. 


TABLE 2
AVERAGE POINT SCALE SCORES OF OLDER NORMAL PERSONS

Chronological Age     50-59    60-69    70-79    80-89
Number of Cases         55       34       13        4

Test  1                 3.0      3.0      3.0      3.0
      2                 3.8      3.7      3.7      3.0
      3                 2.9      2.9      2.9      3.0
      4                 3.8      3.3      3.3      3.5
      5                 3.8      3.7      3.7      3.3
      6                 4.1      3.9      3.4      3.0
      7                 7.8      7.3      7.2      8.0
      8                 1.8      1.8      1.6      2.0
      9                 5.5      5.6      5.2      3.0
     10                 5.9      5.6      5.0      4.0
     11                 2.6      2.5      2.3      1.8
     12                 3.7      3.1      3.4      3.3
     13                 1.6      1.9       .8      1.8
     14                 1.3      1.0       .7       .5
     15                 6.7      5.8      5.5      4.8
     16                  .9       .3       .3   [illegible]
     17                 3.4      2.9      3.1      2.3
     18                 2.6      2.3       .7       .5
     19                 4.4      3.5      2.9      1.5
     20                 2.3      2.2      1.3      1.0

Total Score            71.9     66.3     59.8     53.3

















From this table we see that in general the total score falls
off with advancing chronological age among normal persons.
Of greater importance than the decrease itself, however, is
the manner of the falling off, that is to say, the particular tests
in which lowering of score first appears or in which it appears
in greatest degree. In the table, the small number of cases
in the highest age-group makes their average scores unreliable.
We shall, therefore, confine ourselves to the three younger
age-groups. In these three groups it is evident that with
advancing chronological age there is a tendency for the scores
in each test to fall off. This tendency is most marked in
tests 16 (drawings from memory), 18 (dissected sentences),
14 (three words in one sentence), and 13 (words in three
minutes). It is least marked, on the other hand, in the very
easy tests (1, 2, 3, and 5, aesthetic comparison, missing parts,
comparison of lines and weights, and counting backwards)
and in test 17 (absurdities).
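
The relative size of the falling off can be gauged from Table 2 by expressing the average of the 70-79 group as a percentage of the average of the 50-59 group. The Python sketch below does this for five tests. It assumes that entries printed without a decimal point (such as 8 or 7) stand for tenths, and the names TABLE_2 and retained_percent are illustrative.

```python
# Average scores from Table 2 for the age-groups 50-59, 60-69, and 70-79.
# The 80-89 group is left out, as in the text (only four cases).
# Assumption: entries printed as "8", "7", "9", "3" are read as .8, .7, .9, .3.
TABLE_2 = {
    13: (1.6, 1.9, 0.8),
    14: (1.3, 1.0, 0.7),
    16: (0.9, 0.3, 0.3),
    17: (3.4, 2.9, 3.1),
    18: (2.6, 2.3, 0.7),
}


def retained_percent(scores):
    """Score of the oldest listed group as a percentage of the youngest."""
    youngest, oldest = scores[0], scores[-1]
    return 100.0 * oldest / youngest


# List the tests from the greatest falling off to the least.
for test in sorted(TABLE_2, key=lambda t: retained_percent(TABLE_2[t])):
    print(f"test {test}: {retained_percent(TABLE_2[test]):.0f} per cent retained")
```

On these figures tests 13, 14, 16, and 18 retain between roughly a quarter and a little over a half of the 50-59 score, while test 17 retains about nine tenths.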

When we had reached this point in our investigation, we
realized that to make the study complete we needed records
of normal persons under 50 years of age of the same general
class as our group of older normal persons. It is regrettable
that we had not time to continue our work at the General
Hospital in which we obtained our older normal cases. We
were obliged to use records obtained by other experimenters
from other sources. The only records of persons between
30 and 50 years of age which we could easily consult were
those of persons so superior intellectually to the cases above
considered that they were incomparable. There is, therefore,
a gap at ages 30-50 in our Table (3) which summarizes the
results of all our normal subjects, including school-children,
young men and older persons.


TABLE 3
AVERAGE POINT-SCALE SCORES FOR NORMAL SUBJECTS OF THREE AGE GROUPS

[The columns for the earlier tests and the rows for the eight separate
total-score groups are illegible in this copy. Only the closing section
of the table, for tests 11 to 20, can be read in part.]

AVERAGE FOR THE ABOVE EIGHT GROUPS

                                      Test                             No. of
Age Group          11   12   13   14   15   16   17   18   19   20     Cases

School children   2.3  3.3  2.8  2.9  4.8  2.0  2.6  2.7  2.4  2.3      316
Young men         [illegible]                                           315
Older persons     2.7  3.6  1.5  1.3  5.8   .5  3.1  2.1  3.6  2.0       77

No grade could be given to the school-children in test 6 
because the form of this test has been changed since they 
took the examination. Neither could any grade be given the 
young men in tests 14 and 18 because so many of this group 
were illiterate that these tests were omitted. For this group 
(b), therefore, the groups of total scores in the left hand
column should read 46-50; 51-56; 57-61; 62-65; 66-68; 69-71;
72-74; and 75-90. In the last section of the table we have 
given the averages of the eight total-score groups. This, to 
be sure, is an “average of averages.” Even so, however, 
it is more significant than would be an average of the original 
scores. In the first place, we could not use the simple average 
of scores obtained on each test by each of our age-groups 
because there were so many more high scores among the 
younger that they would have appeared to too great an advan- 
tage. We, therefore, divided our cases in groups by total 
score attained. The averages for these groups may then be 
treated as if they had been “corrected” and as if there were 
an equal number in each total score group, and our “average
of averages” is, then, the average score attained on each 
test by each age-group, supposing the distribution of total 
scores for the age-groups to be the same. 
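
The reasoning behind the “average of averages” can be illustrated with a short Python sketch. The data are hypothetical and are chosen so that, within each total-score group, the two age-groups score alike on a test; only the number of cases in each group differs. The function and variable names are illustrative and are not taken from the original computations.

```python
from statistics import mean


def average_of_averages(cases):
    """Mean of the per-group means for one age-group.

    cases is a list of (total_score_group, test_score) pairs. Every
    total-score group counts equally, however many cases it holds.
    """
    by_group = {}
    for group, score in cases:
        by_group.setdefault(group, []).append(score)
    return mean(mean(scores) for scores in by_group.values())


# Hypothetical data: many high total scores among the young, few among the old.
# Within a total-score group both age-groups earn the same score on the test.
young = [("low", 2.0)] * 2 + [("high", 4.0)] * 8
old = [("low", 2.0)] * 8 + [("high", 4.0)] * 2

simple_young = mean(score for _, score in young)
simple_old = mean(score for _, score in old)

print("simple averages:     ", simple_young, simple_old)
print("average of averages: ", average_of_averages(young), average_of_averages(old))
```

Here the simple averages (3.6 against 2.4) suggest a superiority of the younger group that disappears when each total-score group is given equal weight (3.0 for both).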

From Table 3 it appears that tests 1 (aesthetic compari- 
son), 3 (comparison of lines and weights), 5 (counting back- 
wards), 7 (description of pictures), 9 (comparison of terms), 
and 20 (analogies) show little or no regular change with 
advancing chronological age. Tests 2 (missing parts), 4 
(memory span for digits), 6 (memory span for sentences), 
and 10 (definition of concrete terms) show a slight tendency 
toward decrease with advancing age. Tests 8 (comparison 
of weights), 11 (line suggestion) and 12 (copying square
and diamond) show a slight tendency for the score to increase













with advancing years. The other tests show decided tenden- 
cies. Tests 13 (words in three minutes), 14 (three words 
in one sentence), 16 (drawings from memory), and 18 (dis- 
sected sentences), all show a marked falling-off in score as 
the chronological age increases. This is most marked in 
test 16 (drawings from memory). On the other hand, the 
remaining tests, 15 (comprehension of questions), 17 (ab- 
surdities), and 19 (definition of abstract terms), show an in- 
crease in score with increased age. From the results we may 
conclude that the improvement in the ability to comprehend 
questions comes fairly early, since our young men are so far 
superior to the school children and are practically the same 
as the older persons. The improvement in absurdities and 
definitions of abstract terms, on the other hand, seems fairly 
regular. The falling-off in giving words in three minutes 
seems regular, but in drawings from memory, the score does 
not show any decided decrease until late, that is, the young 
men differ but little from the school children, while the older 
persons are decidedly inferior to the young men. Here again 
we regret that the lack of sufficient data on persons between 
30 and 50 prevents us from more than guessing at the points 
at which the various abilities show changes. 

The main conclusion to be drawn from our work thus far 
is that whether we study psychotic or normal persons with 
approximately the same total score, the younger persons tend 
to excel the older in giving words in three minutes, in building 
sentences and in drawing from memory; while the older excel 
the younger in comprehending questions, in detecting ab- 
surdities, and in defining abstract terms. Our results were 
so consistent throughout that we next turned to the literature 
to see if experimenters who were looking for something else 
had results which agreed at all with ours. 


Results from Previously Published Work 


In our paper on “Significant Responses in Certain Memory
Tests,” (referred to in footnote 2), in which we consider
only Psychopathic Hospital cases, some of whom were diag- 
nosed as “not insane,” we find no uniform changes with in- 
creasing chronological age in the score of memory span for 
digits. We do find, however, a decided tendency for the score 
of drawings from memory thus to decrease. The point at 
which this decrease begins differed for the different mental 
diseases, but in all it begins before age 50. A similar decrease 
is found in the scores of memory for short paragraphs. 

Our results in general are also confirmed by a study of the 













contributions of two other writers. Both of these are reports 
of Binet examinations (one, the Goddard, 1911, and the other, 
the Stanford revision). The differences in grading and in 
the tests themselves make a rough comparison the only one 
possible. 

An article by Wender⁵ gives much valuable material, though
his conclusion on the basis of his selected cases that his data 
furnish proof for the necessity of a revision of the scale 
appears unjustified. His Table II shows the tests passed by 
30 cases of senile dementia or arterio-sclerosis, with an aver- 
age chronological age of 74 and an average mental age of 
9.4. From it we have calculated the percentage of the cases 
which passed each test which corresponds to a test on the 
Point Scale. The results of such calculation are given in 
Table 4. 


TABLE 4
PERCENTAGE OF WENDER’S CASES PASSING TESTS SIMILAR TO POINT SCALE

Binet 1911 Test         Corresponding        Percentage
                        Point Scale Test     Passing

VI, 5                         1                  87
[illegible row or rows]
[illegible]                   5                  67
VII, [?]; XV, 1               7                  67
[illegible]                   8                  13
V[?]; IX, 2                  10              [illegible]
[?]II, 4                     11                  17
VII, 4                       12                  67
XI, 3                        13              [illegible]
X, 5; XI, 2                  14                  27
X, 4                         15                  57
X, 2                         16                   7
XI, 1                        17                  33
XI, 5                        18                  27
XII, 2                       19                  83


The most striking points about Table 4 are that test 19 (a 
12-year Binet test) should have been passed by 83 per cent 
of a group whose average mental age was only 9.4 and that 
test 16 (a 10-year test) should have been passed by only 7 
per cent of the same group. Tests 2 and 8 (missing parts 
and arrangement of weights) also give a low average when 


⁵ “The Applicability of Binet-Simon Tests in Psychoses of the
Senium.” N. Y. Medical Journal, March 6, 1915.





















we consider the age level to which they presumably belong. 
Test 8 was one of the tests in which our younger psychotic 
patients were superior to the older psychotic patients. Test 2 
was found to change very little until we reached the 80-89 
group of our normal older persons. But, other than this, 
our results do not seem to uphold Wender’s findings for 
tests 2 and 8. The successes in Wender’s group are more 
striking than the failures. We have already mentioned the 
remarkable performance in test 19 (definition of abstract 
terms). Tests 15 and 17 (comprehension of questions and 
absurdities) also give very high percentages. Test 15, a 
10-year test, is passed by 57 per cent of the cases; and test 
17, an 11-year test, by 33 per cent. These three tests, it
will be remembered, were the three which we found to be 
particularly easy for the older persons, and we may say, 
therefore, that Wender’s work supports our results in so far 
as the two have anything in common. 

The second contribution mentioned is in Terman’s statis- 
tical account of the bases of the Stanford revision. From 
this account we have taken the results on Knollin’s unem- 
ployed men (hoboes), chronological ages 21-60, but chiefly 
25-40, mental ages 10-18. The results are summarized in 
Table 5. 

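TABLE 5
PERCENTAGE OF KNOLLIN’S CASES PASSING TESTS AT DIFFERENT MENTAL AGES

Mental ages covered by the table: 10, 11, 12, 13, 14, 15, 16, 18

[The alignment of the percentages under the mental-age columns is not
preserved in this copy; each row gives its entries in the order printed.]

Stanford Revision     Percentage Passing          Corresponding
Test                                              Point Scale Test

IX, 2                 70  79                             8
X, 6                  40  60  56  60                    13
IX, 5                 30  67  75  [illegible]           14
X, 5                  70  85  98  100  100              15
X, 3                  41  52  67  70  73                16
X, 2                  50  60  82  81                    17
XII, 4                 0  27  54  63                    18
XII, 2                50  90  83  97                    19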


The blank spaces in Table 5 seem to mean that if the table 
were completed, there would be zeros in the blanks to the left 
of the figures given and hundreds to the right. It is, of 
course, impossible to determine what these subjects might have 





⁶ Terman and others, “The Stanford Revision and Extension of the
Binet Simon Scale for Measuring Intelligence,” 1917, p. 163 ff.

















done on the other comparable tests of the Point Scale because 
the arrangement of the Stanford is such that they were not 
given.⁷ Test 18 (dissected sentences) is very difficult for the
Knollin group. Although it is a 12-year test, only 27 per 
cent pass at age, and only 63 per cent of cases with a mental 
age of 14 pass. Likewise test 16 (drawings from memory) 
is difficult, for although a 10-year test, only 41 per cent of 
those with mental age 10 pass it, and only 73 per cent of 
mental age 14 pass. Test 14, again (three words in one sen- 
tence), is done but poorly. On the other hand, test 15 (com- 
prehension of questions), gives percentages which are normal 
for its place in the scale and test 17 (absurdities) is not far 
behind. Test 19 (definition of abstract terms) is evidently 
done better than we should expect from school children. We 
have, then, roughly, though not as decidedly as for Wender’s 
older cases, the conclusion that tests 14, 16, 18, and probably 
13 (three words in one sentence, drawings from memory, 
dissected sentences, and words in three minutes) are difficult 
for hoboes of chronological ages between 25 and 40, and that 
test 19 (definition of abstract terms) is easy for them. Fur- 
thermore, if we compare these cases with those for Williams’ 
juvenile delinquents,⁸ ages mostly 14 to 21, we find that the
hoboes do much better than the delinquents in test 19 (defini- 
tion of abstract terms), and that the delinquents far excel 
the hoboes in tests 13, 16, and 18 (words in three minutes, 
drawings from memory, and dissected sentences). The same 
tendency holds true if we compare them with still younger 
cases. The only comparison possible between the business 
men and the High School pupils⁹ is on tests 18 and 19 (dissected
sentences and definition of abstract terms). Test 19
is passed by 1 per cent more pupils than business men, but 
in test 18 this percentage rises to 10, showing that test 18, at 
least, is harder for business men. That is to say, the falling 
off in ability to put together dissected sentences has begun 
already in middle aged business men. In the comparison of 
abstract terms (a Stanford test which is similar to the Point 
Scale definition of abstract terms) 13 per cent more business 
men pass than do pupils. We find again then that younger 


⁷ The above seems to the writers a strong argument for the use of
scales of the type of the Point Scale in all cases where there is a prob- 
ability of wide “scatter.” In year scales there is far greater chance of 
passing over some defect or peculiarity simply because it is not expected 
at the chronological or mental age of the subject. 

⁸ Terman and others, op. cit., p. 170 ff.

⁹ Op. cit., p. 171 ff.

























persons consistently excel older in certain tests and are inferior 
to them in others. 

The results of other investigators, therefore, although ob- 
tained for quite different purposes, agree almost absolutely 
with our findings as to the influence of chronological age. 


Probable Reasons for the Differences Between Older and 
Younger Persons 


The above data show with reasonable certainty that there 
are decided changes in the distribution of abilities (as shown 
by the Point Scale) as persons get older. These changes 
appear whether we compare the insane or the normal, whether 
we lump all our cases together or fractionate them by par- 
ticular disease, whether we compare the very old with the 
very young, or whether we compare the middle-aged with 
young adults. The only condition which must be observed 
is this one: that only those whose total score, or general level 
of intelligence, is approximately the same may be reliably 
compared. If we do not make this restriction, the young will 
be found to excel in each test as well as in the total score. 

The next question is: Why do we find this change in 
ability as persons get older? It is evident, of course, that 
as a person grows older he loses some of the abilities he had 
as a child or young adult. We are all familiar with aged per- 
sons who fail to remember what happened yesterday, but who 
expect their grandchildren to recall, as they themselves do, 
events which happened many years ago. We say commonly 
that old people have poor recent, but good remote memory, 
but we seldom inquire into their abilities outside the field 
of memory. 

In the present paper we have tried to determine more ex- 
actly the abilities and disabilities of the old and to estimate 
roughly, at least, the age at which changes are most marked. 
It is to be supposed that a defect in memory which is clearly 
present in a person of 80 must have been coming on for 
some time. 

We have found that in the abilities which are tested in the 
Point Scale the old persons have deteriorated much more in 
some than in others. The reasons for the individual losses 
we conceive to be three. In the first place, there is some 
actual loss in ability. This is shown particularly in the draw- 
ings from memory. From our practical acquaintance with 
aged persons we are forced to conclude that they are actually 
unable to recall certain recent impressions. In the second 




























































place, there is a lack of practice in certain kinds of per- 
formance. Such, for example, we take to be the case in the 
construction of sentences. There seems to be a possibility 
that, given sufficient incentive and sufficient practice in this 
test, an older person may equal the performance of his juniors. 
The difficulty is that the common incentives such as praise, 
approval, etc., which are so effective with children, are of 
little avail with the old. We come now upon the third point 
which is probably the key of the whole problem. The younger 
subjects are almost invariably more alert and interested. Their 
experience is such that they fit more naturally into the test 
situation. They appear more adaptable than the older ones. 
Moreover, the tests in which they excel are those which most 
resemble “stunts” or “puzzles” and which, therefore, require
not only willingness, but also a rapid adjustment of the sub- 
ject. If we consider the tests in which the older subjects are 
superior we find them to be the ones which are more like 
the problems which arise in the daily life of adults and which 
could be answered best by persons who had had the accumu- 
lated experience of years. 

There seems, therefore, to be such a decided break between 
the older and the younger persons that it is not fair by the 
former to grade them by an examination intended primarily 
for adolescents. The whole question of the applicability or 
fairness of any such examination to older subjects therefore 
depends on the purpose for which it is given. 

The purposes of examinations of persons over 50 seem to us 
to be two: first, the determination of the degree of deteriora- 
tion or aberration present; second, the determination of the 
presence of feeble-mindedness. In both cases there are two 
possible standards of comparison, namely their supposed 
former ability (or that of the average normal young person) and
the average ability of their contemporaries. It is without 
doubt interesting to note that a person who once had a mental 
age of 18 has now one of only 10. It is, however, of much 
greater importance to know whether the average person of 
the same present chronological age and of the same former 
mental age has deteriorated to the same degree. If the patient’s 
deterioration is the same in amount and kind as that of his 
normal contemporary, then we cannot lay that deterioration 
to the presence of mental disease or to initial feeble-minded- 
ness. Moreover, if the history of an old person convinces 
us that he has always been of a low grade mentally it is 
often desirable to know the maximum mental age which the 
patient ever attained. Our best guess here would be based 
















on those tests in which the normal contemporary has not 
shown deterioration. 

In most work with psychological examinations we have the 
constant difficulty that many non-psychological persons (and, 
alas! some psychologists) take our results as too simple. They
read the mental age, without taking any notice of the com- 
ments of the examiner. If the mental age is less than 12 they 
glibly diagnose the patient feeble-minded. In order to cir- 
cumvent such hasty diagnosticians and in order to give a 
mental age which shall more exactly express the ability of the 
subject before the influences of old age became marked, we 
have calculated some allowances which should be made in the 
case of persons over 50 years of age. 


Suggested Score Corrections for Old Age 


Already at this hospital we had been in the habit of making 
allowance for the omission of certain tests and we now applied 
the same method to a scheme for discounting the effect of 
advancing years. Perhaps it will be as well to give the history 
of the previous work, so that the present calculations will 
not seem too fanciful. To be sure, the plan we are about to 
present has obvious faults and we can claim for it no more 
than fairly satisfactory results. We give it here in the 
hope that the idea will lead some others to similar work and 
will in the end result in an accurate and theoretically correct 
table. 

The first problem of the kind which arose was the question 
of how to grade patients who were totally deaf and who, 
therefore, could not be given tests 4 and 6 (auditory memory 
span for digits and sentences). Our procedure at first was 
to add the scores with these tests omitted, call that the mini- 
mum mental age, then add to that the highest score obtainable 
on the two omitted tests, call that the maximum mental age, 
and then say that the true mental age lay somewhere between 
those two limits. This was fairly satisfactory, but we thought 
it possible to get a more accurate statement. This we com- 
puted from our table of scores for each test which were to be 
expected for different ranges of total score.¹⁰ From the
table we calculated the amount of credit to be expected on 
tests 4 and 6 for each of the ranges of total score. We then 
constructed a table giving the amount that should be added 
for each total score obtained when the two tests were omitted. 





¹⁰ This table was published with some printer’s errors (later corrected)
in the Journal of Abnormal Psychology, XIII, 1918, p. 77.













We later made similar tables of corrections for omission of 
tests 14 and 18 (lack of education) and for tests 1, 2, 3(a), 
7, 11, 12, 16, and 18 (total blindness). The corrections for 
lack of education were adopted by the Division of Psychology 
in the Surgeon General’s Office for use in the examination 
of illiterates. The corrections are given in Table 6. 


TABLE 6 


CORRECTIONS FOR POINT SCALE NORMS WHEN CERTAIN TESTS ARE 
OMITTED 





When 4 and 6            When 14 and 18          When 1, 2, 3(a), 7, 11,
are Omitted             are Omitted             12, 16, and 18 are Omitted
(Deafness)              (Education)             (Total Blindness)

For Scores:   Add:      For Scores:   Add:      For Scores:   Add:
13-25          5        [up to 51]     0        [up to 13]    11
26-60          6        52-58          2        14-15         15
61-78          7        59-62          4        [16-21]       16
79             8        63-69          6        22-28         17
80-91          9        70-74          8        [29-34]       18
                        75-77          9        35-39         21
                        78-90         10        [40-42]       24
                                                43-48         27
                                                49-50         29
                                                51-52         30
                                                [53]          32
                                                54            33
                                                55-66         34
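
The use of such a table may be sketched in a few lines of Python. The sketch below is an illustration only and no part of the original procedure: the names `DEAFNESS_CORRECTIONS`, `correction`, `corrected_score`, and `score_bounds` are hypothetical, the ranges are the deafness column of Table 6 as it can be read here, and the greatest credit obtainable on the omitted tests is left as a parameter because the text does not state it.

```python
# Deafness column of Table 6 (tests 4 and 6 omitted), as (low, high, add).
# The values are transcribed from the table as read here, an assumption.
DEAFNESS_CORRECTIONS = [
    (13, 25, 5),
    (26, 60, 6),
    (61, 78, 7),
    (79, 79, 8),
    (80, 91, 9),
]


def correction(score, table=DEAFNESS_CORRECTIONS):
    """Return the credit to add for a total score obtained with tests omitted."""
    for low, high, add in table:
        if low <= score <= high:
            return add
    # The table gives no correction outside its printed ranges.
    raise ValueError(f"score {score} lies outside the tabled ranges")


def corrected_score(score, table=DEAFNESS_CORRECTIONS):
    """Total score after the tabled allowance has been added."""
    return score + correction(score, table)


def score_bounds(score, max_omitted_points):
    """Earlier method: minimum and maximum total score.

    max_omitted_points is the highest credit obtainable on the omitted
    tests; it is a hypothetical parameter, not a value given in the text.
    """
    return score, score + max_omitted_points
```

A subject scoring 85 with tests 4 and 6 omitted would thus be credited with `corrected_score(85)`, that is, 85 plus the tabled allowance of 9. The earlier minimum and maximum method corresponds to `score_bounds`.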


With these tables as models, we proceeded to make a similar 
table to correct for advanced chronological age. We have 
found throughout, as we have said, that the older subjects are 
almost without exception poorer in tests 13, 14, 16, and 18
than younger persons attaining the same total score. We have 
therefore supposed that these tests should be omitted in giving 
the examination to old people and have calculated the cor- 
rections for such omission. We do not mean that the tests 
should actually be omitted. On the contrary, if a person of 
over 50 years of age obtains a high score on the four tests, 
it is evident that he has not begun to lose certain abilities 
which many of his contemporaries have lost. In other words, 
in our opinion failure on tests 13, 14, 16, and 18 on the Point 
Scale means little or nothing if the subject is advanced in
years, while success on those tests may be very significant.
The corrections which we offer tentatively for this group of 
advanced ages are given in Table 7. 
















TABLE 7 


CORRECTIONS FOR POINT SCALE NORMS TO BE USED WITH OLDER SUBJECTS 





When Tests 13, 14, 16 and 18 are Omitted
(Advanced Chronological Age)

For Scores     Add
[up to 36]      0
37-43           1
[44-48]         3
49-53           5
[54-55]        [illegible]
56-58          12
[59-61]        13
62-66          15
[67-69]        17
70-82          18


At first thought it may appear that if we correct for failures 
which seem to be due to advanced age alone, we should also 
correct for successes which are apparently due to the same 
cause. Perhaps we should. If the idea were carried to its 
logical extreme we would be correcting for every test except 
1 and 20, the only ones in which the average score for young 
and old is identical. Such a procedure would, of course, be 
meaningless, and would amount to giving a mental age on 
the basis of two tests alone. Somewhere, then, we must draw 
the line between no correction and total correction. We con- 
sidered at first correcting for those tests in which one age 
gave an average score which was 120 per cent of the score 
obtained by the other age. This limit, however, would make 
us correct for 8 tests, in five of which the younger and in 
three of which the older were superior. Eight seemed such 
a large percentage of the total number (20) of tests that we 
were afraid we were again basing mental age on too few 
tests. If the limit were raised to 200 per cent, we would 
be correcting for only three tests, 13, 14, and 16. Test 18 
which came next on the list with the younger excelling the 
older by 136 per cent, was later included because the test is 
one which many of the older subjects dislike, and which they 
often cannot see to read. The actual limit used was, there- 
fore, 136 per cent. 


Conclusions 


1. There are certain definite changes in the distribution of 
scores on the Point Scale as the chronological age of the sub- 
ject increases. 
















2. These changes are evident in both normal and psychotic 
persons. 

3. There are three probable reasons for the changes: loss 
of ability, lack of practice, and absence of alertness or of 
interest in the older subjects. 

4. The mental condition of a subject over 50 years of age 
will be much more accurately presented if two mental ages 
are given: one which compares him with his own adolescent 
ability (or with that of normal young persons), and one which 
compares him with his normal contemporaries. 

5. A mental age which compares a subject with his normal 
contemporaries may be calculated from our Table 7. 




















SOME RESULTS AND INFERENCES DERIVED FROM 
THE USE OF THE ARMY TESTS AT THE 
UNIVERSITY OF MINNESOTA¹


By M. J. Van Wagenen, University of Minnesota


During the two academic years 1917-18 and 1918-19 the 
Army Test Alpha or its equivalent, Form E, was given to the 
freshman classes of nearly all the colleges at the University 
of Minnesota. In some of the colleges the results were used 
as an aid in diagnosing the causes of student failures. In 
other colleges the purpose was purely an experimental one. 
The use of the tests has revealed some significant information 
regarding the students of the various colleges. For instance, 
over eighty per cent of the student body of the University as 
a whole were found to come from the upper fifteen to twenty 
per cent of the population in general. With the exception 
of two of the individual tests the women of the several col- 
leges did just as well or even better than the men of the 
same colleges, but in these two tests—the range of informa- 
tion test and the arithmetic problems test—from seventy to 
seventy-five per cent of the men did as well or better than 
the median woman. Making correction for the excess of over- 
lapping due to the use of a single test, from sixty-five to 
seventy per cent of the men may be expected to do as well or 
better in solving arithmetic problems and to have as wide or 
a wider range of the kind of information called for in the 
Army Tests than has the median woman of the same college. 
Incidentally, the students of the College of Education were 
found to be a markedly superior group,—corresponding well 
with the highly selected group in the Medical School,—a fact 
which speaks well for the leadership of the teaching pro- 
fession. More surprising was the discovery that eighty per 
cent of the students of one technical college fell below the 
median student of another technical college in the abilities 
involved in the doing of such tasks as those making up the 
Army Test. And this is evidently a stable condition, for the 





1 Read before Section L of the American Association for the Ad- 
vancement of Science at the Annual Meeting, held in St. Louis, De- 
cember, 1919. 


















differences of the second year were practically the same as 
they were for the first year the test was given. 

Interesting and valuable as such information may be to 
those in charge of the various colleges, nevertheless we are 
more concerned at the present time with the actual value of 
the Army Test and similar mental tests as a basis for predict- 
ing what kind of academic work will be done by each student 
during his college course, and especially with the further use 
that may be made of such tests as a basis of acquiring more 
definite knowledge of vocational aptitudes than merely the 
settling of the question as to whether or not a student is 
likely to succeed in completing a college course. 

During 1918-19 one of the graduate students in the depart- 
ment of educational psychology, Miss Judith Jacobs, worked 
out the correlations of the scores of the Army Test and also 
the correlations of the scores of each of the individual tests 
with the averages of the marks made by the women students 
of the College of Science, Literature and the Arts during 
each semester of their freshman year. The second Army 
Test given to the University freshmen was also given to a 
large number of sophomore women, most of whom had taken 
a similar test, Form E, the previous year. This made pos- 
sible a comparison of the students’ scores on the two tests 
taken some fourteen months apart and also the determina- 
tion of the relation of each of the two Army Tests used to 
the averages of the academic marks made by the women during 
each semester of their freshman year. It was necessary to 
confine this study to the records of the women students of 
the College of Science, Literature and the Arts, as no other 
college offered so large a group of records for the women 
and the work of the men had been too badly broken up with 
the advent of the S. A. T. C. to permit of any accurate study 
being carried on with the use of their academic marks. 

For 139 women students between the ages of eighteen and
twenty, Miss Jacobs found a correlation of +.39 ± .05 between
the scores made in the Army Test, Form E, and the averages
of the marks for the first semester of the freshman year, a
correlation of +.35 ± .05 with the averages of the marks for
the second semester, and a correlation of +.34 ± .05 between
the scores made in the Army Test Alpha, Form 6, and the
averages of the marks for the first semester of the previous
(freshman) year. Miss Jacobs also found that the correlation
between the averages of the marks for the first semester
of the freshman year and those of the second semester was
+.78 ± .02, while the correlation between the Army Test, Form
















E, and Alpha, Form 6, was +.82 ± .02. The correlation between
the scores of the Army Test, Form E, and the averages
of the marks for the first quarter of the sophomore year
dropped to +.13 ± .05. At the same time the correlations
between the averages of the marks for the first and second
semesters of the freshman year and the averages of the marks
for the first quarter of the sophomore year fell to +.51 ± .05
and +.52 ± .05 respectively. The very low correlation between
the Army Test scores and the averages of the students’ marks 
for the first quarter of the sophomore year is undoubtedly 
due to a very large extent to the errors of judgment in the 
assignment of the sophomore marks, since the quarter involved 
less than ten weeks of academic work. If this assumption 
is true then any increase in the accuracy of the judgments 
of the instructors regarding the quality of students’ achieve- 
ments should result in higher correlations between the Army 
Test scores and the averages of the academic marks. Later 
it will be shown that this was exactly the condition found. 
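
The second figure attached to each coefficient above is presumably its probable error. The text does not state the formula used; the sketch below assumes the conventional one, P.E. = 0.6745 (1 − r²) / √N, and the function name `probable_error` is hypothetical.

```python
import math


def probable_error(r, n):
    """Probable error of a correlation coefficient r found from n cases.

    Assumes the conventional formula 0.6745 * (1 - r**2) / sqrt(n); the
    source does not say which formula was actually employed.
    """
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)


# Coefficients reported for the 139 women students.
for r in (0.39, 0.78):
    print(f"r = {r:+.2f}, P.E. = {probable_error(r, 139):.3f}")
```

Under this assumption the coefficients of +.39 and +.78 for 139 cases give probable errors that round to .05 and .02, in agreement with the figures reported for those two coefficients.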

For the individual tests of the Army Test the averages of 
six correlations of the scores with the averages of the marks 
of one semester are as follows: 


Practical judgment scores and averages of marks for 1 semester    r = -.09
Oral direction scores         "     "     "     "     "     "      r = -.01
Series completion scores      "     "     "     "     "     "      r = +.15
Disarranged sentences scores  "     "     "     "     "     "      r = +.21
[Analogies] scores            "     "     "     "     "     "      r = +.2[illegible]
Synonym-antonym scores        "     "     "     "     "     "      r = +.[illegible]
Range of information scores   "     "     "     "     "     "      r = +.31
Arithmetic problems scores    "     "     "     "     "     "      r = +.36


The scores of two of the eight individual tests show practically 
no relationship to averages of the marks, while the scores of 
five show a correlation above +.20, the scores of two of them,
the range of information test and the arithmetic problems
test, showing a correlation above +.30.

The relationships between the scores of the two Army 
Tests, those between the averages of the marks for the two 
semesters of the freshman year, and those between the scores 
of the Army Tests and the averages of the marks for each 
semester are shown in the following table more clearly per- 
haps than they are expressed by the coefficients of correlation. 
By first finding the standard deviation or sigma for the scores 
of the Army Tests and for the averages of the marks, and 
then finding in which of the ten one-half sigma units each 
individual score fell, we have the table below. It reads: Of 
the 141 students who took both Army Tests, 39 per cent 













stood in the same one-half sigma unit or differed in their 
standing in the two tests by less than one-tenth of the range 
of the scores of the college students; 41.9 per cent differed 
by not more than one one-half sigma unit or by less than 
one-fifth of the range of the scores; 15.6 per cent differed 
by not more than two one-half sigma units or by less than 
three-tenths of the range of the scores; 2.8 per cent differed 
by not more than three one-half sigma units or by less than 
four-tenths of the range of the scores; while only 0.7 per 
cent differed by as much as four one-half sigma units. 


TABLE I 


Difference      Form E      Grades of      [Form E and]   [Form E and]   [Form 6 and]   [Form 6 and]
in one-half     and         1st Semester   Averages of    Averages of    Averages of    Averages of
Sigma Units     Form 6      and of         Grades of      Grades of      Grades of      Grades of
                            2nd Semester   1st Semester   2nd Semester   1st Semester   2nd Semester

No. of Cases    141         224            241            235            105            105
0 Difference    39.0%       34.4%          19.5%          19.2%          15.2%          [illegible]
1               41.9%       [illegible]    [illegible]    [illegible]    [illegible]    [illegible]
2               15.6%       18.7%          17.4%          19.6%          21.9%          22.9%
3               2.8%        [illegible]    [illegible]    15.7%          13.4%          17.1%
4               0.7%        [illegible]    [illegible]    [illegible]    [illegible]    4.8%
5               ....        [illegible]    [illegible]    [illegible]    [illegible]    2.8%
6               ....        [illegible]    [illegible]    [illegible]    [illegible]    0.9%


Although two of the individual tests of the Army Test 
contribute practically nothing and a third but little toward 
prognosticating the academic record of a student, it is never- 
theless astonishing how much the Army Test, taking less than 
forty-five minutes to give, will tell about a student’s probable 
success in making academic marks during either semester of 
the freshman year. Over fifty per cent of the students will 
not change their standing in academic marks from their 
standing in the Army Test by more than one one-half sigma 
unit or by more than one-fifth of the range of the scores 
of college students or by more than a change from a D to 
a C or from a C to a B, as these marks are given to college 
students. Over twenty-five per cent of the students will make 
a greater change in their standing in their academic marks 
from the first semester to the second, even when the same 
subjects are studied and, for the most part, with the same 
instructors. As far as making academic marks in either 
semester of the freshman year is concerned less than fifteen
per cent of the students will be misplaced by the Army Test
by more than three one-half sigma units or by four-tenths
of the range of the marks or by as much as a change from
an E to a C or from a D to a B or from a C to an A. True
it is that this is a rather large misplacement, but over two


















per cent of the students are displaced just as much in their 
academic marks from the first semester of their freshman 
year to the second semester. Then, too, were the academic 
marks of both semesters of the freshman year taken into 
account the amount of misplacement by the Army Test would 
be considerably less than here indicated. 

Equally significant with the determination of the amount of 
change that occurs in connection with the Army Tests and 
college marks is the determination of the direction of the 
changes that may be expected of students of different degrees 
of ability. For this study three groups of students were 
selected: those who stood in the second and third one-half 
sigma units or in the E and D— units, those who stood in 
the fifth and sixth one-half sigma units or in the C— and C 
units, and those who stood in the eighth and ninth one-half 
sigma units or in the B and A— units. The units referred 
to are the one-half sigma units or the respective tenths of the 
base line of a normal surface of frequency lying between 
—2.5 sigma and +2.5 sigma. 
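
The assignment of scores to one-half sigma units, and the tabulation of shifts between two measures, may be sketched as follows. This is an illustration under stated assumptions and not the computation actually carried out: the names `half_sigma_unit` and `unit_shift_table` are hypothetical, the population standard deviation is assumed, and scores beyond 2.5 sigma from the mean are simply placed in the first or tenth unit.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def half_sigma_unit(score, mu, sigma):
    """Unit (1 to 10) of the base line between -2.5 and +2.5 sigma."""
    z = (score - mu) / sigma
    unit = math.floor((z + 2.5) / 0.5) + 1
    # Assumption: scores beyond the base line fall in the end units.
    return max(1, min(10, unit))


def unit_shift_table(first, second):
    """Per cent of cases differing by 0, 1, 2, ... units on two measures.

    first and second are paired lists of scores (or averages of marks)
    for the same students; each is scaled by its own mean and sigma.
    """
    mu1, sd1 = mean(first), pstdev(first)
    mu2, sd2 = mean(second), pstdev(second)
    diffs = [
        abs(half_sigma_unit(a, mu1, sd1) - half_sigma_unit(b, mu2, sd2))
        for a, b in zip(first, second)
    ]
    counts = Counter(diffs)
    return {d: 100.0 * c / len(diffs) for d, c in sorted(counts.items())}
```

Because each measure is referred to its own mean and sigma, test scores and averages of marks can be compared on the same ten-unit scale, which is what allows a shift to be read as a change of so many letter grades.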

The first table below (Table II) and subsequent tables
(Tables III to VII) read: Of the 23 students who stood 
in the second and third one-half sigma units from the bottom 
of the distribution or in the E and D— group in the first Army 
Test, 34.8 per cent did not change their position in the second 
Army Test as much as one one-half sigma unit or as much 
as the difference between an E and a D—; 39.1 per cent 
changed as much as one one-half sigma unit but not as much 
as two one-half sigma units or not as much as the difference 
between an E and a D; 21.7 per cent changed as much as 
two one-half sigma units but not as much as three one-half 
sigma units; while only 4.4 per cent changed as much as four 
one-half sigma units or as much as from an E to a C. While
34.8 per cent of the group did not shift as much as one 
one-half sigma unit in either direction, 52.2 per cent made a 
higher position in the second test than in the first one and 
13 per cent made a lower position in the second test. Of the 
group of fifty-seven C— and C students 36.8 per cent remained 
practically stationary in the second Army Test, while 24.5 
per cent made a higher position and 38.6 per cent made a 
lower position. Of the 15 B and A— students 53.3 per cent 
remained in the same positions, 26.7 per cent attained a higher 
position and 20 per cent dropped to a lower position. 
















TABLE II


AMOUNTS OF CHANGE OF POSITION FROM ARMY TEST, FORM E, TO ARMY TEST ALPHA,
FORM 6, IN THE CASE OF THREE GROUPS OF STUDENTS


Amount of Difference in     2nd and 3rd one-half    5th and 6th one-half    8th and 9th one-half
One-half Sigma Units        Sigma Units or the      Sigma Units or the      Sigma Units or the
                            E and D- Group          C- and C Group          B and A- Group
                            of Students             of Students             of Students

Number of Cases             23                      57                      15
0 Difference                34.8%                   36.8%                   53.3%
1                           39.1%                   [illegible]             [illegible]
2                           21.7%                   15.8%                   6.6%
3                           ....                    1.7%                    6.6%
4                           4.4%                    ....                    ....
Per Cent of Cases Falling
in Higher Units in 2nd
Test                        52.2%                   24.5%                   26.7%
Per Cent of Cases Falling
in Lower Units in 2nd
Test                        13%                     38.6%                   20%


The table (Table III) giving similar facts for the averages 
of the academic marks of the two semesters of the freshman 
year shows the same tendencies; namely, the smallest amount
of shifting occurs among the superior students and it is just 
as likely to be a shift to a higher as to a lower position. The 
greatest amount of shifting occurs among the students making 
the lower records and is more likely to be a shift upward than 
a shift downward. The dominant shifting tendency among 
the students of the middle group is a downward one in the 
second semester’s work as well as in the second Army Test. 


TABLE III


AMOUNTS OF CHANGE OF POSITION FROM THE MARKS OF THE FIRST SEMESTER TO THE MARKS
OF THE SECOND SEMESTER IN THE CASE OF THREE GROUPS OF STUDENTS


Amount of Difference in     2nd and 3rd one-half    5th and 6th one-half    8th and 9th one-half
One-half Sigma Units        Sigma Units or the      Sigma Units or the      Sigma Units or the
                            E and D- Group          C- and C Group          B and A- Group
                            of Students             of Students             of Students

Number of Cases             [illegible]             [illegible]             22
0 Difference                [illegible]             [illegible]             [illegible]
1                           [illegible]             [illegible]             45.4%
2                           [illegible]             [illegible]             13.6%
3                           [illegible]             [illegible]             [illegible]
4                           [illegible]             [illegible]             [illegible]
5                           [illegible]             [illegible]             4.5%
Per Cent of Cases Falling
in Higher Units in 2nd
Semester                    [illegible]             [illegible]             31.8%
Per Cent of Cases Falling
in Lower Units in 2nd
Semester                    [illegible]             [illegible]             31.8%


The next four tables (Tables IV to VII) indicate in the 
same way the degree of stability and the amount and direc- 
tion of the shifting of positions from those achieved in the 
Army Tests to those made in the academic grades. 













TABLE IV


AMOUNTS OF CHANGE OF POSITION FROM THE ARMY TEST, FORM E, TO THE MARKS OF
THE FIRST SEMESTER OF THE FRESHMAN YEAR


Amount of Difference in     2nd and 3rd one-half    5th and 6th one-half    8th and 9th one-half
One-half Sigma Units        Sigma Units or the      Sigma Units or the      Sigma Units or the
                            E and D- Group          C- and C Group          B and A- Group
                            of Students             of Students             of Students

Number of Cases             [illegible]             [illegible]             30
0 Difference                [illegible]             [illegible]             [illegible]
1                           [illegible]             [illegible]             23.3%
2                           [illegible]             [illegible]             20.0%
3                           [illegible]             [illegible]             23.3%
4                           [illegible]             13.1%                   20.0%
5                           [illegible]             [illegible]             [illegible]
Per Cent of Cases Falling
in Higher Units in 1st
Semester's Marks            [illegible]             [illegible]             [illegible]
Per Cent of Cases Falling
in Lower Units in 1st
Semester's Marks            [illegible]             [illegible]             [illegible]


TABLE V


AMOUNTS OF CHANGE OF POSITION FROM THE ARMY TEST, FORM E, TO THE MARKS OF
THE SECOND SEMESTER OF THE FRESHMAN YEAR


Amount of Difference in     2nd and 3rd one-half    5th and 6th one-half    8th and 9th one-half
One-half Sigma Units        Sigma Units or the      Sigma Units or the      Sigma Units or the
                            E and D- Group          C- and C Group          B and A- Group
                            of Students             of Students             of Students

Number of Cases             [illegible]             [illegible]             [illegible]
0 Difference                12.8%                   20.7%                   7.1%
1                           [illegible]             29.8%                   28.6%
2                           [illegible]             23.2%                   21.4%
3                           [illegible]             [illegible]             17.8%
4                           [illegible]             [illegible]             10.7%
5                           [illegible]             [illegible]             [illegible]
6                           [illegible]             [illegible]             [illegible]
Per Cent of Cases Falling
in Higher Units in 2nd
Semester's Marks            [illegible]             [illegible]             [illegible]
Per Cent of Cases Falling
in Lower Units in 2nd
Semester's Marks            [illegible]             [illegible]             82.1%
TABLE VI


AMOUNTS OF CHANGE OF POSITION FROM THE ARMY TEST ALPHA, FORM 6, TO THE
MARKS OF THE FIRST SEMESTER OF THE FRESHMAN YEAR


Amount of Difference in     2nd and 3rd one-half    5th and 6th one-half    8th and 9th one-half
One-half Sigma Units        Sigma Units or the      Sigma Units or the      Sigma Units or the
                            E and D- Group          C- and C Group          B and A- Group
                            of Students             of Students             of Students

Number of Cases             [illegible]             [illegible]             20
0 Difference                [illegible]             [illegible]             [illegible]
1                           [illegible]             [illegible]             35.0%
2                           [illegible]             [illegible]             10.0%
3                           [illegible]             [illegible]             15.0%
4                           [illegible]             [illegible]             15.0%
5                           [illegible]             [illegible]             [illegible]
6                           [illegible]             [illegible]             5.0%
Per Cent of Cases Falling
in Higher Units in 1st
Semester's Marks            [illegible]             [illegible]             [illegible]
Per Cent of Cases Falling
in Lower Units in 1st
Semester's Marks            [illegible]             [illegible]             80.0%






















TABLE VII


AMOUNTS OF CHANGE OF POSITION FROM THE ARMY TEST ALPHA, FORM 6, TO THE
MARKS OF THE SECOND SEMESTER OF THE FRESHMAN YEAR


Amount of Difference in     2nd and 3rd one-half    5th and 6th one-half    8th and 9th one-half
One-half Sigma Units        Sigma Units or the      Sigma Units or the      Sigma Units or the
                            E and D- Group          C- and C Group          B and A- Group
                            of Students             of Students             of Students

Number of Cases             [illegible]             [illegible]             19
0 Difference                7.1%                    27.8%                   0.0%
1                           [illegible]             36.1%                   36.8%
2                           [illegible]             22.2%                   26.3%
3                           21.4%                   [illegible]             21.1%
4                           7.1%                    [illegible]             5.3%
5                           [illegible]             [illegible]             10.5%
Per Cent of Cases Falling
in Higher Units in 2nd
Semester's Marks            [illegible]             [illegible]             [illegible]
Per Cent of Cases Falling
in Lower Units in 2nd
Semester's Marks            [illegible]             [illegible]             [illegible]


The facts of these tables may be briefly and conveniently 
summarized as follows: the per cents of students making low 
scores in the Army Tests who stay in the same positions 
in their academic marks are relatively low, ranging from 7 
per cent to 29 per cent in the four tables. The per cents of 
low standing students making lower positions in academic 
marks than in the Army Tests are also small, ranging from 
7 per cent to 19 per cent; while the per cents making higher 
positions in academic marks than in the Army Tests are
relatively high, ranging from 60 per cent to 78 per cent. In 
the case of the middle group of students—the C— and C stu- 
dents—from 35 per cent to 45 per cent are likely to achieve 
a higher position in academic grades than in the Army Tests. 
Likewise from 35 per cent to 45 per cent are likely to fall 
to a lower position in the academic grades than in the Army 
Tests. The amount of shifting in this group is smaller, how- 
ever, than in either the superior or the low standing group of 
students. Not more than from 15 per cent to 30 per cent of 
the average students shift their positions in academic marks 
from those made in the Army Tests by more than the difference
between a D and a C or that between a C and a B. In
the case of the superior students just the opposite tendency 
is found from that found among the students making the 
lower records in the Army Tests. Practically 80 per cent 
of those making the higher positions in the Army Tests make 
somewhat lower positions in their academic grades, while only 
from 0 to 20 per cent achieve higher positions in their academic 
marks than in the Army Tests. 







This situation is highly significant in dealing with the use 
of mental tests for prognosticating a student’s probable 
academic success. It would seem that the tests prove least 
useful just where reliable results from their use are most
needed; namely, in eliminating those most likely to fail in their
college work and in selecting for special groups those who 
are most likely to attain the higher degrees of success. When 
it is recalled that of those standing low in the Army Test
far more of them are likely to reach only slightly higher 
positions in their academic work it is not improbable that 
the extra amount of effort put into their work by the very 
low standing students may account in part for the large per- 
centage tending to attain higher positions in their academic 
work. Likewise the fact that the students standing in the 
higher positions in the tests tend to fall in somewhat lower 
positions in their academic work may in part be accounted 
for by a lack of maximum application to their work, espe- 
cially when there is not the external pressure to keep them 
working at their maximum effort that is exerted upon the less 
capable students. In addition to these factors some amount of 
shifting is undoubtedly to be accounted for by the large amount 
of outside work done by not a few of the students, and also 
by the insistent demand of the more serious social activities 
upon the time and effort of the more socially inclined indi- 
viduals. It is not improbable, too, that it is particularly diffi- 
cult for the instructors to select the exceptionally gifted 
college students and to accurately gauge their abilities and 
their achievements, just as for the elementary school teacher 
to select the exceptionally gifted pupils of her class has been 
found to be attended with great inaccuracy. 

There is yet, however, far too large a discrepancy between 
the positions attained in the Army Tests and those achieved 
in academic marks to warrant the use of the Army Tests for 
purposes of rigid selection. Two questions insistently arise 
at this point: First, are mental tests, such as the Army Tests, 
improvable to the degree that their use will be feasible in 
selecting and classifying college students? To this question 
an affirmative answer is undoubtedly warranted at the present 
time. Second, is it possible to improve the tests to the point 
of reducing the discrepancies within reasonable limits without 
at the same time securing more adequate measurements of 
the actual achievements of college students than is now afforded 
by the academic marks, which are based very largely upon the 
subjective judgments of the instructors? An affirmative 
answer to this question is reasonably doubtful, and this, too, 













is one of the more important conditions tending to minimize 
the actual value of the use of mental tests in connection with 
the admission and classification of college students. 

Mental tests such as the Army Test are improvable for the 
end in view in two directions: first, additional tests showing 
a fair degree of correlation with academic marks and at the 
same time contributing elements not found in the tests already 
available may be included with those now showing some degree 
of relationship to academic marks; second, the reliability of 
the tests may be increased by improving the tests and by 
making them more extensive so that similar series of tests 
will give more nearly the same arrangements of scores for the 
individuals tested. The fact that nineteen per cent of the 
subjects tested by the two Army Tests showed a change of 
one-fifth or more of the total range of the scores for college 
students is very clear evidence of the need of more extensive 
tests—tests lasting a few hours instead of a few minutes. 
This is not a criticism of the Army Test for the purpose for 
which it was intended. To classify the upper twenty per 
cent of the population in small units is a much more difficult 
task than to classify the total population into relatively large 
units. Incidentally, one of the strangest phenomena in con- 
nection with the use of mental tests has been the demand 
of so many intelligent people in the field of education that 
the tests accomplish in a very few minutes what they them- 
selves willingly admit cannot be as well accomplished by any 
other means in any length of time. 
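
The gain to be expected from more extensive tests can be put in rough figures. The text gives no formula for this; the sketch below assumes the Spearman-Brown formula and treats the correlation of +.82 between Form E and Alpha, Form 6, as the reliability of a single form, which is likewise an assumption. The function name `spearman_brown` is hypothetical.

```python
def spearman_brown(r, k):
    """Reliability predicted when a test of reliability r is lengthened k times.

    Spearman-Brown formula; it supposes that the added material is
    comparable in kind and quality to the original test.
    """
    return k * r / (1 + (k - 1) * r)


# Assumed reliability of one forty-five minute form: the +.82 found
# between Form E and Alpha, Form 6.
for k in (1, 2, 4):
    print(f"{k} x length: predicted reliability {spearman_brown(0.82, k):.3f}")
```

On these assumptions a test of twice the length would be expected to reach a reliability of about .90, and one of four times the length about .95. The figures are illustrative only, since they depend on the added tests being of the same kind as those already in use.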

Just as increasing the length of the tests, other conditions 
being equal, increases the reliability of the tests, so increasing 
the number of judgments regarding the academic work of the 
students ought, other conditions being equal, to increase the 
reliability of the marks, and, at the same time, enhance the 
relationship between the tests and the academic marks if 
errors of judgment regarding academic achievements are in 
part responsible for the relatively low correlations between 
the test scores and the averages of the academic marks. In 
the case of eighty-four sophomore women for whom complete 
records were available the following coefficients of corre- 
lation were obtained when the scores made in the two Army 
Tests, Form E and Alpha Form 6, were correlated with the 
averages of the academic marks for both the first and the 
second semesters. 

Army Test, Form E and averages of marks for the two
semesters r = +.46 ± .06.






















Army Test Alpha, Form 6 and averages of marks for the
two semesters r = +.50.

For the five individual tests of the Army Tests that gave 
the highest correlations with the averages of the marks, the 
following average correlations were obtained with the averages 
of the marks for both semesters: 


Disarranged sentences scores and av. of marks for both semesters   r = +.33
Synonym-antonym scores         "    "     "     "     "      "        r = +.35
Range of information scores    "    "     "     "     "      "        r = +.36
[Analogies] scores             "    "     "     "     "      "        r = +.37
Arithmetic problems scores     "    "     "     "     "      "        r = +.41


When the scores for these five tests were combined and the 
correlations between the two combinations and the averages 
of the marks for the two semesters obtained the following 
coefficients were found: 


Five tests of Form E and averages of marks for both semesters       r = +.44
Five tests of Alpha Form 6 and av. of marks for both semesters      r = +.47


From these results two facts are evident: the correlations 
between the Army Test scores and the academic marks were 
very noticeably raised—approximately one-tenth—when the 
number of academic marks used was doubled. The inclusion 
of the three individual tests showing the lower correlations 
raised the correlations for the group of tests but very little. 
The importance of the first fact is not to be overlooked, for 
it urgently suggests that the low correlations found between 
the Army Test scores and the averages of the marks for 
either semester of the freshman year, and especially the 
averages of the marks for the first quarter of the sophomore 
year, were in no small measure due to errors of judgment in 
the estimations of the achievements of students. Further- 
more, when the scores of the two Army Tests were combined 
the correlation between the combined scores and the averages
of the marks for the two semesters was +.47. Also when 
the scores of the ten selected tests of the two Army Tests 
were combined into a single score, the correlation with the 
averages of the marks for the two semesters was likewise 
+.47. Thus it is clear that to obtain high correlations between 
the scores made in mental tests and averages of academic 
marks, it is essential to have more accurate measures of the 
academic achievements of students than the subjective esti- 
mates of the instructors. This of course does not imply 
that it is either unnecessary or not worth while to increase the 
reliability and the scope of the tests. It does suggest, however,
that because of the inaccuracies of college marks as measures
of academic achievements it will not be feasible to determine
accurately the full prognostic value of the tests.

That academic marks are not at all accurate indices of 
achievement or of abilities acquired is not a sheer guess nor 
even just a matter of opinion. The composition ability of 
some ninety-eight sophomore students was measured by hav- 
ing them write two half-hour themes and then having the 
themes rated for general merit by three college instructors of 
English with the use of the Thorndike Extension of the Hille- 
gas Scale for Measuring English Composition. The correla- 
tions between the rhetoric marks of the freshman year and 
the quality of the themes were as follows: 


First semester rhetoric marks and qualities of composition A     r = +.45
First semester rhetoric marks and qualities of composition B     r = +.42
Second semester rhetoric marks and qualities of composition A    r = +.41
Second semester rhetoric marks and qualities of composition B    r = +.39


The correlation between the rhetoric grades for the two 
semesters was +.68, while that between the two themes was
+.49. Making corrections for attenuation, the correlation 
between the ability to make marks in freshman rhetoric and 
the ability to write English compositions was only +.72. This 
is only one measurement of the relationship in question and 
hence is not to be taken too seriously. There is no reason 
to suppose that rhetoric marks are less accurate than the marks 
in other college subjects. With no higher correlation, however,
than +.72 between the ability to make academic marks
and the ability to achieve results, one cannot expect very 
high correlations between scores in mental tests and academic 
marks, even though the tests have a high value for prognosti- 
cating probable academic achievement. This condition of 
course places upon the psychologist working to derive more 
adequate mental tests for college use an additional feature 
of uncertainty to cope with. 
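The correction for attenuation reported above can be retraced from the printed coefficients. The sketch below assumes the usual Spearman form: the mean of the four cross-correlations divided by the square root of the product of the two reliabilities (+.68 for the rhetoric marks, +.49 for the themes). The function and variable names are illustrative.

```python
import math
from statistics import mean


def correct_for_attenuation(cross_rs, reliability_x, reliability_y):
    """Mean cross-correlation divided by the geometric mean of the reliabilities."""
    return mean(cross_rs) / math.sqrt(reliability_x * reliability_y)


# Rhetoric marks (two semesters) against theme quality (compositions A and B).
cross = [0.45, 0.42, 0.41, 0.39]
corrected = correct_for_attenuation(cross, reliability_x=0.68, reliability_y=0.49)
print(round(corrected, 2))
```

On these figures the corrected coefficient comes to about .72, which matches the value given in the text.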

The most important finding in connection with the use of 
the Army Tests at the University of Minnesota, however, was 
the revelation of distinct group differences among the several 
college student bodies. In four of the college groups to which 
the tests were given in the two successive years the personnel 
of the groups probably changed but little from the one year 
to the next. A comparison of the median achievements of 
these groups afforded some suggestive results. In comparing 
the positions of the medians of the eight individual tests 
of the two Army Tests in the case of the men and the women
in the College of Science, Literature and the Arts the median
men stood higher than the median women in five of the tests 
of Form E. They also stood higher in four out of the five 
corresponding tests of Form 6. In three of the tests of 
Form E the median men stood lower. In each of the three 
corresponding tests of Form 6 they also stood lower. Upon 
making a similar comparison in the case of the men in the 
College of Science, Literature and the Arts and the men in 
the College of Engineering, the median men in the College 
of Engineering stood higher in five of the tests of Form E. 
In four of the five corresponding tests of Form 6 they also 
stood higher. At the same time the median men in the Col- 
lege of Engineering stood lower in the other three tests. In 
two of the three corresponding tests of Form 6 they also 
stood lower. This lends support to the presumption that there 
is probably a much closer correspondence between the relative 
sizes of the medians of two corresponding groups of individ- 
uals in two similar tests than between the relative sizes of the 
medians for two different groups in the same or similar series 
of tests. Assuming this to hold true for the large group of 
male students in the College of Science, Literature, and the 
Arts, the medians of this group were taken as standards in 
all the tests and the per cent of each of several groups doing 
as well as or better than this standard was determined for 
each test in the two groups of tests, Form E and Form 6. 
In so far as the assumption does not hold true, obviously, the 
effect would be to lower the correspondence where the corre- 
sponding groups took two different series of similar tests. 
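The overlap measure described here, the per cent of a group doing as well as or better than the median of the reference group, can be sketched as follows. The scores are hypothetical and serve only to show the arithmetic; none of them come from the Minnesota data.

```python
from statistics import median


def percent_reaching_standard(reference_scores, group_scores):
    """Per cent of group_scores at or above the median of reference_scores."""
    standard = median(reference_scores)
    reaching = sum(1 for score in group_scores if score >= standard)
    return 100.0 * reaching / len(group_scores)


# Hypothetical scores on one test: the reference group sets the standard.
reference = [12, 15, 18, 20, 22, 25, 30]
other_group = [10, 19, 20, 21, 24, 28, 14, 26]
print(percent_reaching_standard(reference, other_group))
```

Repeating the calculation for every test gives each group one per cent of overlapping per test, and it is these values that are later arranged in order of size.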

After the per cents of overlapping for each test for each 
of the available groups—the College of Engineering students 
and the College of Dentistry students—had been ascertained, 
the tests in each group of tests were arranged for each college 
group according to the amounts of overlapping. When the 
correlations were then determined between the various arrange- 
ments it was found that the coefficients for each two different 
college groups, derived by Pearson’s shorter method, were 
+.07, —.04, +.39, and —.10 respectively, but that when the 
relationships between the two arrangements for the two sets 
of tests for the corresponding groups of the same college were 
determined the coefficients were +.71 for corresponding groups
of one college and +.68 for the corresponding groups of the
other college. This means that the second freshman class 
tested in the College of Engineering stood high in the same 
tests as did the first freshman class, also low in the same 
tests as did the first freshman class tested. The same condition
was found to hold true for the two freshman classes
tested in the College of Dentistry. On the other hand, the 
freshman classes in the College of Dentistry did not stand 
highest in those tests in which the freshman classes of the 
College of Engineering did, nor did they stand lowest in those 
tests in which the freshman classes of the College of Engi- 
neering stood lowest. In fact, there was practically no simi- 
larity or relationship evident when the freshman classes of 
the different colleges were compared with one another as is 
shown by the average of the four correlations above; namely, 
r = +.08 ± .05.
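The comparison of arrangements can be illustrated with a rank correlation. The article names only "Pearson's shorter method" without giving it, so the sketch below substitutes the ordinary rank-difference formula, 1 - 6Σd²/(n(n² - 1)), as a stand-in. The per-test overlap figures are hypothetical; only the four cross-college coefficients are taken from the text.

```python
from statistics import mean


def ranks(values):
    """Rank values from highest (rank 1) to lowest; assumes no ties."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def rank_difference_correlation(xs, ys):
    """Rank-difference coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical per cents of overlapping on eight tests for two successive
# freshman classes of the same college.
first_class = [62, 48, 55, 71, 40, 66, 52, 58]
second_class = [60, 50, 53, 75, 42, 64, 49, 57]
rho = rank_difference_correlation(first_class, second_class)

# The four coefficients reported between different college groups.
cross_college = [0.07, -0.04, 0.39, -0.10]
average_cross = round(mean(cross_college), 2)
print(rho, average_cross)
```

The mean of the four reported cross-college coefficients comes to .08, the average quoted above.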

These indications strongly suggest that in so far as students’ 
interests tend on the whole to lead them to seek the occupa- 
tions for which they are the better fitted by nature, mental 
tests may in time be used not only as a means of predicting 
a student’s probable chances of success in college work in 
general but also as a basis of predicting his chances of success 
in doing the work of each of the various technical colleges and 
at the same time as a basis for giving the student more 
accurate advice in the matter of selecting a vocation than can 
now be given. This task will be long and arduous, and one 
demanding the fullest degree of co-operation on the part of 
educational leaders, especially those in the universities, but 
the results will surely repay for the efforts many-fold. 

























PROPHESYING ARMY PROMOTION 


By S. C. Kohs, Psychologist to Court of Domestic Relations, Portland,
Oregon, and K. W. Irle, Reed College, Portland, Oregon


The human mind has manifested from time immemorial 
an insatiate hunger for those phenomena and traits which 
could answer the question, no matter how indirectly and in- 
adequately, “What may I expect on the morrow,—next 
month,—next year?” Our star-gazers, palm-readers and our 
bump-feelers are still with us in spite of persistent efforts 
on the part of scientific criticism to discredit them, and the 
intense eagerness to know that which the future may hold in 
store for us has increased in the last few decades to immeasur- 
able proportions. 

It is not surprising that many have turned with hope toward 
psychology as the analyzer of individuality, of traits, of cus- 
toms, of events; toward psychology as a synthesizer of more 
or less fragmentary mental evidence for the purpose of more 
clearly indicating or prognosticating what one might expect 
in the future, granting that a given set of causes would in- 
evitably be active. 

This tendency toward developing a prophetic procedure is 
taking more concrete and definite form in connection with 
intelligence testing than with any other phase of experimental 
psychology. Thus we note one study on “The Relation of
Mental Testing to School Administration, with Special Reference
to Children Entering School,”¹ in which one of the
objects was “to offer predictions as to the probable advancement
of each child through the primary grades, this prediction
being based primarily on the potential mental capacity as shown
by the Binet mental test.” (p. 4). A similar study is now
under way at Stanford University by W. M. Proctor, dealing 
with the prediction of progress of students through the secondary
school.

1 By Virgil E. Dickson, in Normal Seminar Bulletin A, No. 1, July,
1917, Dept. of Educ., State Normal School, Cheney, Wash. See also
L. M. Terman, The Intelligence of School Children, Houghton Mifflin, 1919.

Terman’s book “The Intelligence of School Children” is a splendid
contribution to this general subject.
The results of these investigations lead one to a very optimistic 
outlook regarding the possibilities of prophesying school pro- 
gress. The psychological work in the army was largely con- 
cerned with predicting whether a raw recruit showed promise 
of being an asset or a liability to the army organization, and 
if an asset whether as a leader or a follower. In its pamphlet²
the Psychological Division of the Army presents ample proof 
that the Army Intelligence Tests have been an efficient prophe- 
sying agency. And, finally, Terman is accumulating data 
which seem to point toward a constancy of relationship between 
life age and mental age, which if it is proven to exist will 
materially assist the prophecy of success or failure of an 
individual in various lines of activity. The increase of 
interest and research in trade tests and special aptitude tests 
will add data of inestimable value making for greater efficiency 
in selecting promising material for special lines of endeavor. 
The present study is an attempt to determine to what extent 
Reed College could have predicted the progress of 116 of its 
students who entered the service of the army or the navy. 
The data upon which such predictions might have been based 
would have been (a) the quality of their college work, and 
(b) faculty estimates regarding (1) their physical qualities, 
(2) their intelligence, (3) their leadership, (4) their personal 
qualities, and (5) their general value to the service. 


The Original Data 

The material for investigation came from three distinct 
sources. 

Source A. RATINGS. Three judges, members of the
Reed faculty, rated each of the 116 students by means of
Scott’s Rating Scale.³ Information was thus obtained regarding
the faculty’s estimate of each of these men on physical qualities
(3 points the lowest, to 15 points the highest), intelligence
(3 to 15), leadership (3 to 15), personal qualities (3 to 15),
and general value to the service (8 points the lowest, to 40
points the highest).

2 See Army Mental Tests: Methods, Typical Results and Practical
Applications, Nov. 22, 1918, p. 23. Wash., D. C.; L. M. Terman: “The
Use of Intelligence Tests in the Army,” Psychol. Bull. 1918, 15, pp.
177-187; “The Measurement and Utilization of Brain Power in the
Army,” Science, 1919, 49, pp. 221-226, 251-259.

3 War Dept. Adjutant General’s Office. Forms CCP-1102, 5-18-18
and CCP-1104, 5-22-18. See also “The Rating Scale,” Psychol. Bull.
1918, 15, pp. 203-206.

Source B. MARKS. The college marks were obtained
from the Registrar’s office and were grouped under three
headings, as follows:

“Natural Science” included marks in biology, chemistry,
mathematics, physics, psychology, astronomy, geology and
natural science.

“Social Science” included marks in economics, education,
history, politics, political science, sociology, philosophy and
social ethics, and

“Languages” included marks in English, Greek, Latin,
Romance Languages, Germanic Languages, and for want of
a better place to insert them, art, surveying and mechanical
drawing.

The writers were aware that these groupings were artificial 
and arbitrary, but some grouping was necessary and these are 
perhaps as free from serious objections as any. 

The following table presents an analysis of the number of 
marks in each subject-group possessed by various frequencies 
of students; thus, 7 students had only one mark in natural 
science, 13 students had only one mark in social science, 16 
students had 4 marks in the language and fine arts subjects, 
etc. : 














             NO. OF STUDENTS HAVING MARKS IN
No. of     Nat.     Soc.     Lang. and     Total
Marks      Sci.     Sci.     F. Arts
   1         7       13          6           26
   2        25        9         22           56
   3         7        9          2           18
   4        14        7         16           37
   5         4       11          5           20
   6         9       11          8           28
   7         4        7          8           19
   8         2        5         10           17
   9         6        3          7           16
  10         3        6         10           19
  11         2        1          2            5
  12         0        5          3            8
  13         2        0          2            4
  14         1        2          0            3
  15         5        4          1           10
  16         2        2          1            5
  17         4        3          2            9
  18         0        1          1            2
  19         1        1          1            3
  20         1        1          0            2
  21         2        0          1            3
  22         0        0          0            0
  23         0        1          0            1
  24         0        0          0            0
  25         0        1          0            1
Total      101      103        108          312














Source C. ARMY RANK. The college attempted to keep
a careful record, brought up to date, of the progress which
each of its students was making both in the army and in
the navy. The ranks were those which each held the day
the armistice was declared. The ranks ranged from Private
to Major. The following table and graph present the frequency
of each rank.

















[Graph: Number in Army Ranks (vertical axis) for each of the Army Ranks (horizontal axis).]


TABLE SHOWING THE FREQUENCY OF EACH MILITARY RANK

Rank                         No.      Rank                        No.
1. Major                       1      6. Bat. Sergt.-Major          1
2. 1st Lieut.                 12      7. Sergt.                    13
3. 2nd Lieut. (Ensign)        24      8. Corporal                   8
4. Reg. Sergt.-Major           2      9. 1st Class Private          1
5. 1st Sergt.                  1     10. Private                   53
                                         Total                    116


Statistical Treatment 

The statistical treatment divided itself into seven portions: 
(1) the inter-correlations between the estimates of the various 
judges, (2) the correlations between the consensus of opinion 
of the judges regarding the five traits and final army rank, 
(3) the correlations between the five traits, (4) the correlations
between the five traits and the marks attained in the
various subject-groups, (5) the average marks attained in the 
different subject-groups by each of the ten army ranks, (6) 
the correlations between final army rank and marks obtained 
in the three school subject-groups, and (7) the average ratings 
in each of the five traits received by each of the ten army 
ranks. It would have been valuable to have obtained an esti- 
mate of the reliability of each judge, but it would have involved 
time and labor which we could not very easily request. How- 
ever, from similar ratings in the army and from past informa- 
tion regarding the reliability of judges in estimates of this 
nature, we may take for granted a rather high degree of relia- 
bility of judgment. This is evident if we note the inter- 
correlation between the ratings of the various judges, espe- 
cially between judges 2 and 3. 

In view of the fact that the length of service varied from 4 
months to 18 months it was felt necessary to divide the 116 
students into two groups, one consisting of short-service men 
and the other of long-service men. In the latter group were 
placed all who had been in the army or navy more than 10 
months. This division was important because the long-service 
men would presumably have had a better opportunity to obtain 
promotion. In fact, the real test of the validity of various 
prognosticating criteria will depend upon how accurately these 
criteria have “sized-up” this group of long-service men. The
short-service men number 26, the long-service men 90. 

The correlation-coefficients mentioned herein were all ob- 
tained by means of the formula: 


$$ r = \frac{\sum fxy}{\sqrt{\sum fx^{2}} \cdot \sqrt{\sum fy^{2}}} $$

modified by one of the writers (S. C. K.) to correct some
factors which make for too great a deviation from the true
Pearson “r-formula”:*

$$ r = \frac{\sum xy}{N \sigma_x \sigma_y} $$
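The product-moment coefficient named here can be sketched in two equivalent forms: the sum of deviation products over the product of the root sums of squares, and the same sum over N times the two standard deviations. The writers' own modification is not described in this article, so it is not reproduced; the data below are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Sum of deviation products over the product of root sums of squares."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    return sum_xy / (math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy)))


def pearson_r_sigma_form(xs, ys):
    """Same coefficient written as sum(xy) / (N * sigma_x * sigma_y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sigma_x = math.sqrt(sum(a * a for a in dx) / n)  # population standard deviation
    sigma_y = math.sqrt(sum(b * b for b in dy) / n)
    return sum(a * b for a, b in zip(dx, dy)) / (n * sigma_x * sigma_y)


# Hypothetical trait ratings and rank scores for five men.
ratings = [3, 5, 8, 10, 12]
rank_scores = [1, 2, 2, 4, 5]
print(pearson_r(ratings, rank_scores), pearson_r_sigma_form(ratings, rank_scores))
```

The two forms agree because N cancels once the standard deviations are written out as root mean squares of the deviations.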
*Discussed more fully by one of the writers (S. C. K.) in his forthcoming
monograph on the “Block Design Test.”

Results

(1) The correlations between the estimates of the various
judges.

(a) Judge 1 and Judge 2.

Judge 1 was not as reliable as either 2 or 3, owing to his
being somewhat pressed for time in connection with war-service
matters. The correlations between the estimates for
the five traits were as follows (coefficients are positive unless
otherwise indicated; the probable errors of each coefficient are
given in the summary table at the end of this article):


(1) physical qualities ................................ .34
(2) intelligence ...................................... .40
(3) leadership ........................................ .37
(4) personal qualities ................................ .39
(5) value to the service .............................. .34


(b) Judge 1 and Judge 3. 
The correlations between the estimates of these judges were 
somewhat better, though still unusually low: 


(1) physical qualities ................................ .53
(2) intelligence ...................................... .29
(3) leadership ........................................ .50
(4) personal qualities ................................ .38
(5) value to the service .............................. .30


(c) Judge 2 and Judge 3. 

The estimates of these two judges may be regarded as 
quite reliable. The correlations between their judgments 
follow : 


(1) physical qualities ................................ [illegible]
(2) intelligence ...................................... [illegible]
(3) leadership ........................................ .65
(4) personal qualities ................................ .62
(5) value to the service .............................. .72


The estimates of physical qualities showed the lowest cor- 
relation, whereas those of value to the service showed the 
highest correlation. 

The results of this analysis of the ratings of the three judges 
showed rather mediocre reliability of capacity to judge a 
student’s rank with regard to any one of these traits. Of 
course, one of the judges may have been quite expert, but the 
bare data themselves do not reveal such expertness. In fact, 
an analysis of the correlations between final rank obtained
in the army and the average trait estimates of the three judges, 
in the next division of our study, brings into still greater 
question this subjective method of prognosticating progress in 
the Army. Averaging the correlations of the estimates of each 
of the three judges for the five traits we obtain the following: 


(2) intelligence ...................................... .40
(1) physical qualities ................................ .43
(5) value to the service .............................. .45
(4) personal qualities ................................ .46
(3) leadership ........................................ .51













It is interesting to note that the judges showed least agreement
on “intelligence” and most on “leadership.”
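The averaging step can be retraced for the traits whose three pairwise coefficients are all legible in this copy. The sketch below takes the values printed in sections (a), (b) and (c) for leadership, personal qualities and value to the service. Physical qualities and intelligence are left out because their Judge 2 and Judge 3 coefficients cannot be read here.

```python
# Pairwise inter-judge coefficients, in the order
# (Judge 1 & 2, Judge 1 & 3, Judge 2 & 3), as printed in the article.
pairwise = {
    "leadership": (0.37, 0.50, 0.65),
    "personal qualities": (0.39, 0.38, 0.62),
    "value to the service": (0.34, 0.30, 0.72),
}

# Simple arithmetic mean of the three coefficients for each trait.
averages = {trait: round(sum(rs) / len(rs), 2) for trait, rs in pairwise.items()}

for trait, value in averages.items():
    print(f"{trait}: {value:.2f}")
```

Of these three traits, leadership shows the highest average agreement, in line with the remark above.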

(2) Correlations between the consensus of opinion of the 
judges regarding the five traits and final army rank. 

(a) Short-service men.

The correlations between the ranks which these men had 
attained during their short-lived stay in the army and the 
estimates of the judges of their capacities in each of the five 
traits enumerated, are as follows: 


(1) physical qualities ................................ .21
(2) intelligence ...................................... .39
(3) leadership ........................................ .08
(4) personal qualities ................................ .23
(5) value to the service .............................. .30


Of course, one might argue that these correlations are low
because these men were in the army but a short period. However,
the correlations are just as low for the long-service men.

(b) Long-service men.


(1) physical qualities ................................ .28
(2) intelligence ...................................... .14
(3) leadership ........................................ .14
(4) personal qualities ................................ .21
(5) value to the service .............................. .33


Although estimates of “value to the service” and final army 
rank show the highest or next to the highest correlations in 
both groups, this is not true for estimates of “intelligence,”
which is highest in its correlation with final army rank in the
short-service group, and lowest in its correlation with final
army rank in the long-service group. It cannot be said that
“intelligence” becomes less of a factor for success the longer
one remains in the army. But here again one questions the 
value of subjective estimates of amounts of trait manifested 
by various individuals. However, there is the other alterna- 
tive: one might maintain that our judges are expert and that 
the explanation for these low correlations is the haphazard 
fashion in which the army attacks the problem of advancing 
its men or that it considers other factors, not here enumerated, 
the important ones for success. 

(3) The correlations between the five traits. 

The estimates of all the three judges were massed for the 
purpose of this inquiry. 

One looks forward with the greatest anticipation to that 
study which will attempt to analyze, psychologically and statis- 
tically, the factors entering into the moulding of human judg- 














80 KOHS AND IRLE 


ments. What are the influences, conscious or unconscious, 
which make for idiosyncrasy, what is the mechanism of judg- 
ment, what makes for constancy, for reliability, how easily 
are judgments changed, what are the subjective criteria upon 
which judgments are based? The present portion of our in- 
vestigation is somewhat apart from our main thesis. Yet 
there is one striking feature of the data which may be sug- 
gestive to someone interested in this branch of research. There 
has been a good deal written regarding the Spearman-Hart-Burt
explanation of intelligence as a “general common factor.”
Evidence to that effect is deduced by a statistical analysis
of the results of various psychological tests each of which
measures some amount of this “general common factor.”
The “Hierarchy of Coefficients” lends added weight to this
hypothesis. We may here draw an analogy. It seems very
probable that when one is passing a subjective judgment on
the question of whether Person A possesses a certain amount
of trait a, or b, or c, or d, one's judgment of practically
all of these is affected by some constant factor x. For example,
here is Tom Jones. Bill Smith is requested to record
a personal estimate of his character, habits, self-control,
intelligence, sociability, whether excellent, good, fair, very
poor. What probably occurs when Bill estimates is that each
of his judgments is affected by a constant factor, possibly
unconscious, such as “Tom Jones is an excellent fellow. I
like him because his ideas are very attractive to me.” This
example is not typical, of course, but is merely utilized to
illustrate the point. That some such condition possibly exists
is suspected from this table of data, which shows, somewhat,
a hierarchy of coefficients.


                           Physical   Intelli-   Leader-   Personal    Value to
                           Qualities  gence      ship      Qualities   Service
1. Physical Qualities        ...        .27        .54       .45         .63
2. Intelligence              .27        ...        .52       .56         .62
3. Leadership                .54        .52        ...       .60         .71
4. Personal Qualities        .45        .56        .60       ...         .80
5. Value to Service          .63        .62        .71       .80         ...


It is of interest to note (a) that the smallest correlation is
between physical qualities and intelligence (.27), and the highest
between personal qualities and value to the service (.80),
(b) that physical qualities and intelligence show the least
correlations with the other traits whereas value to the service
shows throughout higher correlations.
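The second observation can be checked by averaging each trait's correlations with the other four. The sketch below uses the coefficients of the table above; where a cell is damaged in this copy, the value is read from the matching cell on the other side of the diagonal, on the assumption that the table is symmetric.

```python
traits = [
    "physical qualities",
    "intelligence",
    "leadership",
    "personal qualities",
    "value to the service",
]

# Upper triangle of the inter-trait table: (row trait, column trait) -> r.
upper = {
    (0, 1): 0.27, (0, 2): 0.54, (0, 3): 0.45, (0, 4): 0.63,
    (1, 2): 0.52, (1, 3): 0.56, (1, 4): 0.62,
    (2, 3): 0.60, (2, 4): 0.71,  # .71 read from the symmetric cell
    (3, 4): 0.80,
}


def r(i, j):
    """Look up a coefficient regardless of the order of the two traits."""
    return upper[(min(i, j), max(i, j))]


# Mean correlation of each trait with the four remaining traits.
mean_with_others = {
    traits[i]: sum(r(i, j) for j in range(len(traits)) if j != i) / (len(traits) - 1)
    for i in range(len(traits))
}
ordered = sorted(mean_with_others, key=mean_with_others.get)
print(ordered)
```

Ordered this way, physical qualities and intelligence fall at the bottom and value to the service at the top, which is the rough hierarchy the writers point to.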

(4) The correlations between each of the five traits and the 
marks attained in the various subject groups. 

The question, “To what degree do school marks and the
trait estimates of judges coincide when prognosticating army 
success,” is answered by the following correlations: 


                                Physical    Intelli-      Leader-   Personal    Value to
                                qualities   gence         ship      qualities   the Service
Marks in Nat. Sci.                .15       [illegible]     .40       .40         .38
Marks in Soc. Sci.                .34         .57           .41       .44         .46
Marks in Lang. and Fine Arts      .30         .66           .36       .33         .38


The estimates of the three judges were averaged (Arith. 
Mean) in determining these correlations. It may be of in- 
terest to note that physical qualities showed the least corre- 
spondence with school marks, whereas intelligence showed 
the highest. In other words: Of all the five enumerated 
traits, intelligence was the most important in determining suc- 
cess in school work, whereas physical qualities, although im- 
portant, apparently were the least important of all. There 
is that much to the credit of the faculty and the grading 
system ! 

There may be some who would be interested in a further 
analysis: It seems from these data that intelligence is more 
of a factor for success in the natural sciences, least in the
social sciences, with languages occupying a middle zone! 

It is evident from the above table that with the exception 
of intelligence, school marks inadequately coincide, if at all, 
with the judges’ ranking of these men in order of their abili- 
ties. The two criteria, apparently, are separate,—and, apart 
from intelligence, measure a somewhat different array of 
characteristics. 

(5) The average marks attained in the various subject 
groups by each of the ten army ranks. 

Since the short service men had representatives in only 
three ranks, private, sergeant, and second lieutenant, and be- 
cause of the difference in service time, the short and long- 
service men will be considered separately. The following tables 
present the number of men of each rank in the two groups: 











SHORT SERVICE MEN

Rank                          No. of Men
 3. 2nd Lieut. (Ensign)            4
 7. Sergeant                       2
10. Private                       20
    Total                         26

LONG SERVICE MEN

Rank                          No. of Men
 1. Major                          1
 2. 1st Lieut.                    12
 3. 2nd Lieut. (Ensign)           20
 4. Reg. Sergt.-Major              2
 5. 1st Sergt.                     1
 6. Bat. Sergt.-Major              1
 7. Sergeant                      11
 8. Corporal                       8
 9. 1st Class Private              1
10. Private                       33
    Total                         90


In considering the long service men, only ranks (2), (3), 
(7) and (10) will be compared since the number of men in 
the other ranks number no more than one or two. 

(a) Short-service men. 

Before presenting an analysis of the marks attained, a few 
words might be said in preface regarding the Reed College 
marking system (from the current Reed College Catalogue) : 

“Grades in courses of study are awarded on a scientific 
rather than a personal basis, with definite credit for quality as 
well as for quantity of work. Until all school work can be 
measured by scales, made up of units that are equal in a de- 
fined sense, the best available grading is one of relative 
position in a series. The nearest approach to such a scientific 
basis for awarding college credits appears to be a distribution 
following the normal probability curve, skewed to take account 
of the effect of selecting the student body. 

“Reed College has, from the outset, used ten grades, whose
definitions have such a scientific basis. 








Grade ....................    1     2     3     4     5     6     7     8     9    10
Proportion of students ...  ....    5%   10%   15%   20%   25%   15%    6%    4%  ....






































“Grades 1-5 indicate that a student stands in the upper half
of an average class; grades 6-10 indicate that he is in the lower
half. For example, 2 designates the work which will be done
(in the long run) by the best 5% of all students, and 6 the
work done by that quarter of an average class standing just
below the middle.

“Grade 1 is rarely given, representing a degree of excellence
attainable by not more than one student in four or five hundred;
grade 10 records correspondingly bad failures. The
lowest passable grade is 8; 9 is for ordinary cases of failure.
The grades cannot be interpreted in qualitative terms, as good,
poor, A, C, 90%.”
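The catalogue's description of the grade distribution can be checked with a short tally. The sketch below uses the percentages of the distribution table for grades 2 to 9 and treats grades 1 and 10 as negligible, since the table gives no figure for them; that treatment is an assumption made here for the arithmetic.

```python
# Per cent of an average class expected at each grade (grades 1 and 10 omitted
# because the catalogue table gives no percentage for them).
share = {2: 5, 3: 10, 4: 15, 5: 20, 6: 25, 7: 15, 8: 6, 9: 4}

total = sum(share.values())
upper_half = sum(pct for grade, pct in share.items() if grade <= 5)
just_below_middle = share[6]
passing = sum(pct for grade, pct in share.items() if grade <= 8)

print(total, upper_half, just_below_middle, passing)
```

Grades 2 to 5 together account for half the class and grade 6 for the quarter just below the middle, as the catalogue states. Since a low number denotes good work, a lower average mark in the tables of this study means better standing.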

In the following table are presented the average school 
marks of the three different ranks: 


              AVERAGE MARKS IN
Rank     Nat.     Soc.     Lang. and
         Sci.     Sci.     Fine Arts
  3      4.8      5.9        5.2
  7      5.7      6.2        6.1
 10      5.3      5.0        5.8


Rank (7) (Sergeants) had, on the whole, poorest marks 
throughout. Rank (3) (2nd Lieut. Ensign) was clearly 
superior to (10) (Private) on the basis of marks as one would 
naturally expect, this being especially true for marks in natural 
science. It may or may not be surprising that the evidence 
was reversed for marks in social science, rank (3) being 
inferior to (10). Summarizing the data of the table it may 
be said that those of higher rank obtained higher marks in 
the natural sciences and languages than those of lower rank, 
the situation being reversed for proficiency in the social 
sciences. 

(b) Long-service men. 

In the following table are presented the average school 
marks of the four different ranks: 


              AVERAGE MARKS IN
Rank     Nat.     Soc.     Lang. and
         Sci.     Sci.     Fine Arts
  2      5.3      4.9        5.6
  3      6.1      5.5        6.3
  7      5.1      5.4        6.1
 10      6.1      6.0        5.9


Rank (2) is, throughout, superior to rank (10) especially
in the social sciences. And as a medium of prognostication
these subjects would seem to have the advantage over other
subjects in the college curriculum, this being the direct opposite
of the indications apparent in the case of the short-service
men in the previous division. Rank (7) was, on the whole, 
superior in marks for each group over rank (3). A matter 
worth mentioning is the variability of marks within each rank. 
A glance through the various school marks which the mem- 
bers of the different groups had obtained reveals as great 
an average variability of marks within each rank as is evi- 
dent between the averages of the different ranks. It is ques- 
tionable whether, except in extreme cases, one can rely on a 
school mark in any subject as an aid in prognosticating army 
promotion. This matter is more emphatically brought to our 
attention in the next item to be considered. 

(6) The correlations between final army rank and marks 
obtained in the three school subject groups. 

The correlations for the short and the long-service men 
are presented separately. 


SHORT-SERVICE MEN
Correlation Between Final Rank and
Marks in                          Coefficient
Natural Science                       .12
Social Science                    [illegible]
Languages and Fine Arts               .06


These correlation coefficients are so small that little if any 
diagnostic significance can be attached to the school marks 
for this group of men. 


LONG-SERVICE MEN
Correlation Between Final Rank and
Marks in                          Coefficient
Natural Science                       .14
Social Science                        .24
Languages and Fine Arts               .01


The coefficients here are also small, but as in the previous 
section, marks in social science for the long-service men seem 
more reliable criteria for prognosticating army promotion, but 
even then, their value is greatly limited because of the many 
exceptions. 

On the whole, school marks, although they might aid prog- 
nostication, cannot be depended upon for sole support in this 
effort. In fact, poor as judges’ estimates may be, they seem 
somewhat superior to school marks for diagnosing army
progress. This will be more apparent from the data in the 
succeeding section. 

(7) The average ratings in each of the five traits received
by each of the ten army ranks. 


SHORT-SERVICE MEN

Rank   Physical    Intelli-   Leader-   Personal    Value to   Total
       Qualities   gence      ship      Qualities   Service
  3      12          14.3       10.3      13.5        33        83.10
  7      13.5        10         11.5      11.5        28        74.50
 10      11          11.2       10.6      11.5        27.8      72.10























Although the separate traits with the exception perhaps of 
“value to the service” do not demonstrate any clear correla- 
tion between height in rank and height in trait-score, never- 
theless the totals do show this clearly. Comparing the high- 
est rank (3), with the lowest, (10), the former throughout 
shows higher trait-scores than the latter with the exception of 
“leadership.” “Intelligence” and “Value to the Service”
show the greatest difference in trait-score between the highest 
and lowest ranks. This helps confirm the earlier claims for 
these two traits as more efficient for prognostication than any 
of the other three. 
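
The comparison in the preceding paragraph can be checked directly from the short-service table. The following is only a minimal sketch: the figures are copied from that table, while the names RATINGS, total and trait_gaps are illustrative and do not come from the original study.

```python
# Average judges' ratings for the short-service men, copied from the table above.
TRAITS = ["Physical Qualities", "Intelligence", "Leadership",
          "Personal Qualities", "Value to Service"]
RATINGS = {
    3:  [12.0, 14.3, 10.3, 13.5, 33.0],
    7:  [13.5, 10.0, 11.5, 11.5, 28.0],
    10: [11.0, 11.2, 10.6, 11.5, 27.8],
}


def total(rank):
    """Sum of the five trait ratings for one rank (the 'Total' column)."""
    return round(sum(RATINGS[rank]), 1)


def trait_gaps(high_rank, low_rank):
    """Trait-by-trait difference in average rating, high rank minus low rank."""
    return {trait: round(high - low, 1)
            for trait, high, low in zip(TRAITS, RATINGS[high_rank], RATINGS[low_rank])}


gaps = trait_gaps(3, 10)
# List the traits from the largest to the smallest gap between rank 3 and rank 10.
for trait, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{trait:20s} {gap:+.1f}")
```

On these figures the two largest gaps fall on "Value to Service" and "Intelligence", and "Leadership" is the only trait on which rank 10 exceeds rank 3, which agrees with the text.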


LONG-SERVICE MEN

Rank   Physical    Intelli-   Leader-   Personal    Value to   Total
       Qualities   gence      ship      Qualities   Service
  2      11.9        12.5       12.2      12.9        34.       84.30
  3      12          10.4       10.7      11.5        29.7      74.30
  7      11.7        11.7       11.3      12.2        29.7      76.60
 10      10.4        10.1       10.3      10.9        26.2      67.90





Here again rank 2, as well as rank 3, in each of the five 
traits, demonstrated considerable superiority over rank 10; 
especially is this true for rank 2. Again, with the exception 
of “value to the service” the separate traits do not show a 
clearcut correlation between height in rank and superiority 
in trait-score. It is of interest to note that rank 2, averaged 
more than 16 points higher in total score than rank 10. As 
was the case with school marks, so here, with regard to judges’ 
estimates, rank 7 is apparently superior to rank 3 in the opinion 
of the judges, and apparently the army has not utilized the 
same criteria in its judgment and placing of these men.

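                      SUMMARY OF THE CORRELATIONS

No.  Cases        Facts Compared                                              Cor.         P.E.
 1   [illegible]  Estimates of Judges I and II re Physical Qualities          .34          .06
 2   113          Estimates of Judges I and II re Intelligence                .40          .06
 3   [illegible]  Estimates of Judges I and II re [Leadership]                .37          .06
 4   113          Estimates of Judges I and II re Personal Qualities          .39          .06
 5   [illegible]  Estimates of Judges I and II re Val. to the Service         .34          .06
 6   96           Estimates of Judges I and III re Physical Qualities         .53          .05
 7   76           Estimates of Judges I and III re Intelligence               .29          .07
 8   39           Estimates of Judges I and III re [Leadership]               .50          .08
 9   60           Estimates of Judges I and III re Personal Qualities         .38          .07
10   35           Estimates of Judges I and III re Val. to the Service        .30          .10
11   [illegible]  Estimates of Judges [II] and III re [Physical Qualities]    [illegible]  [illegible]
12   [illegible]  Estimates of Judges [II] and III re Intelligence            [illegible]  .06
13   39           Estimates of Judges [II] and III re [Leadership]            .65          .06
14   60           Estimates of Judges [II] and III re Personal [Qualities]    .62          .06
15   35           Estimates of Judges [II] and III re [Val. to the Service]   .72          .06
16   26           Est. of Judges for Phys. Qual. and Rank, 1-9 mos.           .21          .13
17   26           Est. of Judges for Intelligence and Rank, 1-9 mos.          [illegible]  .12
18   26           Est. of Judges for Leadership and Rank, 1-9 mos.            .08          .13
19   26           Est. of Judges for Pers. Qual. and Rank, 1-9 mos.           [illegible]  [illegible]
20   26           Est. of Judges for Val. to Ser. and Rank, 1-9 mos.          [illegible]  [illegible]
21   90           Est. of Judges for Phys. Qual. and Rank, 10-18 mos.         .28          .06
22   90           Est. of Judges for Intelligence and Rank, 10-18 mos.        .14          .07
23   90           Est. of Judges for [Leadership] and Rank, 10-18 mos.        .14          .07
24   90           Est. of Judges for Pers. Qual. and Rank, 10-18 mos.         .21          .06
25   90           Est. of Judges for Val. to Ser. and Rank, 10-18 mos.        [illegible]  [illegible]
26   21           Nat. Sci. marks and Rank, [1-9 mos.]                        .12          .15
27   23           Soc. Sci. marks and Rank, [1-9 mos.]                        [illegible]  [illegible]
28   25           Lang. F.A. marks and Rank, [1-9 mos.]                       .06          .13
29   80           Nat. Sci. marks and Rank, 10-18 mos.                        .14          .08
30   80           Soc. Sci. marks and Rank, 10-18 mos.                        .24          .07
31   85           Lang. F.A. marks and Rank, 10-18 mos.                       .01          .08
32   101          Nat. Sci. marks and Judges' est. Phys. Qual.                .15          .07
33   [illegible]  Soc. Sci. marks and Judges' est. Phys. Qual.                .34          .06
34   108          Lang. F.A. marks and Judges' est. Phys. Qual.               .30          .06
35   100          Nat. Sci. marks and Judges' est. Intelligence               [illegible]  [illegible]
36   103          Soc. Sci. marks and Judges' est. Intelligence               .57          .04
37   108          Lang. F.A. marks and Judges' est. Intelligence              .66          .04
38   [illegible]  [Nat. Sci.] marks and Judges' est. Leadership               .40          .06
39   103          Soc. Sci. marks and Judges' est. Leadership                 .41          .06
40   108          Lang. F.A. marks and Judges' est. Leadership                .36          .06
41   [illegible]  [Nat. Sci.] marks and Judges' est. Personal Qual.           .40          .06
42   108          Soc. Sci. marks and Judges' est. Personal Qual.             .44          .06
43   107          Lang. F.A. marks and Judges' est. Personal Qual.            .33          .06
44   100          Nat. Sci. marks and Judges' est. Val. to the Ser.           .38          .06
45   102          Soc. Sci. marks and Judges' est. Val. to the Ser.           .46          .05
46   108          Lang. F.A. marks and Judges' est. Val. to the Ser.          .38          .06
47   116          Est. of 3 Judges as to Phys. Qual. and Intelligence         .27          .06
48   [illegible]  Est. of 3 Judges as to Phys. Qual. and Leadership           .54          .05
49   [illegible]  Est. of 3 Judges as to Phys. Qual. and Pers. Qual.          .45          .05
50   [illegible]  Est. of 3 Judges as to Phys. Qual. and V. to Ser.           .63          .04
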
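                SUMMARY OF THE CORRELATIONS (Continued)

No.  Cases        Facts Compared                                              Cor.         P.E.
51   116          Est. of 3 Judges as to Intelligence and Leadership          .52          .05
52   [illegible]  Est. of 3 Judges as to Intelligence and Pers. Qual.         .56          .04
53   [illegible]  Est. of 3 Judges as to Intelligence and V. to Ser.          .62          .04
54   [illegible]  Est. of 3 Judges as to Leadership and Pers. Qual.           .60          .04
55   [illegible]  Est. of 3 Judges as to Leadership and V. to Ser.            .71          .04
56   [illegible]  Est. of 3 Judges as to Pers. Qual. and V. to Ser.           .80          .03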
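
The P.E. column gives the probable error of each coefficient. The paper does not state how it was computed; the sketch below assumes the conventional formula of the period, P.E. = 0.6745 (1 - r^2) / sqrt(N), and the function name probable_error is illustrative only. Under that assumption several rows of the table are reproduced, though some printed values differ by one unit in the second decimal place, so the formula should be read as a plausible reconstruction and not as the authors' stated method.

```python
import math


def probable_error(r, n):
    """Probable error of a correlation coefficient r from n cases.

    Assumes the conventional formula 0.6745 * (1 - r^2) / sqrt(n);
    the source does not say which formula was actually used.
    """
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# (row number, cases, correlation) taken from legible rows of the summary table.
rows = [(6, 96, 0.53), (16, 26, 0.21), (36, 103, 0.57), (47, 116, 0.27)]
for number, cases, r in rows:
    print(number, cases, r, round(probable_error(r, cases), 2))
```

For these four rows the rounded results agree with the printed P.E. values (.05, .13, .04 and .06 respectively).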


Conclusions. 

Summarizing the results of our analysis the following would 
be our conclusions : 

(1) School marks are rather inefficient instruments for 
determining whether a student will make good progress in 
the army. 

(2) Human judges, with all their frailties, are, on the whole,
more efficient prognosticators of progress than the school marks
which students obtain.

(3) Of all the criteria for prophesying success, the safest 
are, first, judges’ estimates of value to the service, and second, 
judges’ estimates of intelligence. In both cases, however, the 
correlations are low. If objective, rather than subjective, 
estimates of these traits had been used, the correlations might 
have been increased considerably. We already possess objective
instruments for measuring “intelligence”; why not produce
similar instruments for measuring “value to the service”?

(4) We have taken for granted, throughout this study, that 
the army was correct in its selections. We recognize that this 
assumption is not wholly valid; the real fault may not lie at 
all with our criteria but rather with the current methods in 
the army by means of which men are selected for superior 
positions. 

(5) The low correlations should be therefore explained as 
being due in part to 

(a) Imperfections in our standards of judging individual 
efficiency, marks and the estimates of judges ; 

(b) Imperfections in the system of army promotion ; 

(c) Differences in the factors upon which the army bases 
its promotion and those upon which school marks and judges’ 
estimates are based. 

(6) Although school marks and judges’ estimates may serve 
as aids in prophesying army progress, other criteria must be 
relied upon for any satisfactory development of a prognosti- 
cating machinery. 









THE DEGREE OF PH.D. AND CLINICAL 
PSYCHOLOGY 


By EDGAR A. DOLL, Psychologist, N. J. State Dept. Institutions
and Agencies 


There is reason to protest against the use of the degree of 
Ph.D. as a sine qua non in the “certification” of a clinical
psychologist. It may have occurred to some who have enjoyed
the discussions in this JOURNAL of what constitutes clinical
psychological expertness that one certifies himself as a clinical 
psychologist by the nature and character of his work and 
the consequent reputation therefrom ensuing. 

There seems to be a general agreement, however, that the 
degree of Ph.D. is an essential. The argument is presumably 
based on two considerations (a) that the degree is a testi- 
mony of advanced work and scholarly accomplishment, and 
(b) that it is something of a guarantee of superior general 
mental ability. Against (a) it may be protested that the attain- 
ment of this degree in academic psychology is no guarantee 
of either information or ability in clinical psychology, although 
undoubtedly it is a desirable and perhaps necessary basis for 
clinical psychology. Against (b) it may be argued that while 
those who hold the Ph.D. may be willing to admit its selec- 
tive influence as a measure of general mental ability there will 
be others who might contend that clinical psychology demands 
specific technical ability which may be more or less independent 
of general ability at the Ph.D. level of general ability. 

It is one of the functions of a clinical psychologist to dis- 
cover the exceptional case. He specializes in individual dif- 
ferences. The diagnosis of ability is determined without error 
only when the standard employed excludes all in whom the 
trait is absent and also includes all in whom it is present. 
Applying clinical methods to the diagnosis of clinical psycho- 
logical ability, can it be said that the Ph.D. degree is valid 
as a fixed condition of clinical psychological ability? We have 
seen in the Army that such was not the case. The overlapping 
of men with the degree who were failures as clinical psychologists
and men without the degree who were successes was
certainly too great to permit the use of the attainment of this 
degree as anything but a general criterion of clinical psy- 
chological ability. 

Some reasons for this are fairly obvious. Ability in the 
field of individual mental diagnosis is very largely a matter 
of specific training rather than general preparation. In the 
long run the man with a Ph.D. degree has the advantage in 
ease and rapidity of assimilating such training, but his attain- 
ment in general psychology is no guarantee of the specific 
preparation required. Moreover, a man of fair general in- 
telligence without the degree is able under good instruction to 
gain this specific ability independently of the Ph.D. degree 
(which is after all only certification for formal work in resi- 
dence at a college of accredited standing). 

Clinical ability is also founded on experience with clinical 
material under clinical conditions. This ability cannot be im- 
parted ex cathedra in the classroom. Hence a man with the 
Ph.D. degree who has specialized in the theoretical or academic 
considerations of clinical psychology is not qualified until 
he has served his “ interneship,” which is perhaps equally as 
important as the Ph.D. degree itself. 

Moreover mental diagnosis is nearly as much an art as it 
is a science. A successful clinical psychologist must have a 
successful “ clinical personality,” the “ clinical temperament,” 
the ability to obtain and maintain that rapport which we so 
frequently hear mentioned. 

We therefore maintain the following general propositions : 

1. Clinical psychological ability demands specific training in 
the several allied fields of mental diagnosis (such as physiology, 
psychiatry, anthropometry and education, for example) as 
well as general training in academic psychology. 

2. This ability is based on experience and specific training 
as well as on formal academic preparation. 

3. This experience and specific training may be obtained 
independently of the Ph.D. degree. 

4. Some men with the Ph.D. degree are failures as clinical 
psychologists and some men without the degree are successes. 
The extent of overlapping is not inconsiderable. 

5. Therefore, while the degree of Ph.D. may be a desirable 
adjunct to clinical psychological ability, it is not a necessary 
prerequisite. 

In conclusion it is well to emphasize that we do not favor 
dispensing with the Ph.D. as a qualification for certification.
Undoubtedly many Doctors of Philosophy have become psy- 
chological clinicians, just as many clinicians have become Doc- 
tors of Philosophy. But it is important to emphasize that a 
candidate for certification ought to be free to offer demon- 
strated ability or knowledge in clinical psychology in lieu of 
the Ph.D. degree. The American Psychological Association 
does not insist on the Ph.D. degree as an unconditional quali- 
fication for membership ; a candidate may offer the equivalent 
of the degree in terms of demonstrated ability to pursue work 
of high character in psychology. There seems neither need
nor justification for greater rigor of qualifications in the
subordinate field.

























MINOR STUDIES FROM THE PSYCHOLOGICAL 
LABORATORY OF INDIANA UNIVERSITY 


VI. THE INFLUENCE OF (a) INADEQUATE SCHOOLING 
AND (b) POOR ENVIRONMENT UPON RESULTS 
WITH TESTS OF INTELLIGENCE¹


By LUELLA WINIFRED PRESSEY.


I. Problem. The present paper is essentially a continuation 
of two studies which have already been briefly reported in this 
JOURNAL; the first dealt with the comparative intelligence of
country children and city children, and the second compared
in a similar fashion children from good and from poor homes.²
In these studies two group scales applicable from the third 
grade through high school were used. In each instance the 
test findings were looked to for aid in an understanding of 
various sociological and economic facts also discovered, in 
the course of the survey, with regard to the districts and 
families studied. And in each case the tests showed marked 
differences ; the city children rated distinctly above the country 
children, and children whose fathers were day laborers were 
found strikingly below the children of professional men in 
“native endowment.” 

In the course of the survey, however, many facts were found 
which seriously brought in question the validity of these find- 
ings. Thus most of the country schools were “six months 
schools ” and many of the children failed to attend regularly 
even during this brief period when the schools were open. 
The country district studied is, in fact, in a county notorious 
in the state for the inefficiency of its school system; the 
teachers are incapable and ill-trained, and the equipment miser- 
ably inadequate—in some instances the children were entirely 











1 The paper was presented, in slightly different form, at the meetings
of the American Psychological Association, Cambridge, Mass., Dec.,
1919.

2 Pressey, S. L., and Thomas, J. B., “ A Study of Country Children 
in (1) A Good and (2) A Poor Farming District by Means of a 
Group Scale of Intelligence,” Journal of Applied Psychology, Vol. 3, 
pp. 283-286, 1919, and Pressey, S. L. and Ralston, Ruth, “ The Relation 
of Occupation to Intelligence as It Appears in the School Children of 
a Community,” Journal of Applied Psychology, Dec., 1919, pp. 366-373.

without paper and pencils, the school work being done alto- 
gether on slates or on the blackboard! The group scale of 
intelligence given these country children presupposed (as do 
all such scales for use in the upper grades) a considerable 
degree of literacy and a fair reading vocabulary, and it in- 
volved the use of pencil and paper. It seemed, therefore, very 
possible that the poor showing made by these children might 
be the result quite as much of their inadequate schooling as 
of any lack in native intelligence. Somewhat analogous diffi- 
culties were encountered in evaluating the data obtained from 
children from homes at different economic levels. In good 
homes there is a background of general information and cul- 
ture, and a wealth of reading, which (it would seem) cannot 
but facilitate work on such tests. In the homes of the quarry 
hands and factory workers of the city studied, such influ- 
ences are notable for their absence; in fact, the parents are 
often illiterate. In short, these various accessory factors were 
felt to be so important, and so pervasive in their influence, that 
in the papers presenting these two studies it has been urged 
that results with a scale of performance tests (for the country 
children), and with tests for measuring home culture (as a 
check upon the “ occupation of parents ” data) would be neces- 
sary, before the findings with the tests of intelligence could 
be satisfactorily interpreted. 

It has been found possible, however, to get back of such 
special influences for the most part, by the simple expedient 
of examining the children from these different groups at their 
entrance to school, or soon after, with the “Primer Scale.”³
This brief scale of intelligence does not involve literacy, nor 
school training; children from country and city should thus 
meet the examination on equal terms. And the scale is given 
to the children at so early an age that the home influences 
just referred to might be expected to have operated to a much
less extent than later; particularly is it important that the 
children are tested before they have learned to read readily— 
any effects coming from the superior opportunities for read- 
ing in the good home are thus largely avoided. Comparison 
of results obtained from the younger children, using the Primer 








3 For a description of the scales mentioned in this paper see Pressey,
S. L. and L. W., A Group Point Scale for Measuring General Intelligence,
Journal of Applied Psychology, Vol. II, 1918, pp. 250-269;
“Cross-out” Tests, Journal of Applied Psychology, Vol. III, 1919,
pp. 138-150; or Pressey, L. W., A Brief Group Scale of Intelligence
for Use in the First Three Grades, J. Ed. Psychol., Sept., 1919. The
material is also briefly discussed in the Bulletin of the Extension
Division, Indiana University, Vol. V, No. 1, 1919.

scale, with results from the older children, using the more 
usual type of examination, should in fact not only make clearer 
the relative mental ability of the groups studied; the com- 
parisons should aid in evaluating the scales involved, and also 
in estimating the general importance of the various environ- 
mental factors mentioned, in mental test work. 

II. Results. The first data obtained for the present study
with the Primer scale were from 183 country children six, 
seven, and eight years of age—all the children of these ages in 
fourteen country schools which had already been surveyed the 
year before with a scale for use in the upper grades (the 
“Schedule D” or “Group Point Scale”) containing ten tests
of memory, controlled association, arithmetical reasoning, and
so on. These results obtained the year before with children 
10-14 years old showed only 20% of the country children 
rating above the median for their age, when compared with 
norms obtained from city children. If, now, this poor show- 
ing of the country children were partly the result of their 
poor schooling, then the data obtained with the Primer scale 
should show a much greater per cent of the six, seven and 
eight year old country children above the medians for their 
age. As a matter of fact, only 22% of these younger country 
school children score above the median for their age, as deter- 
mined from city children! 

It would seem reasonable to conclude, then, that the differ- 
ences found by both scales, between country and city children, 
were real differences in intelligence. The findings tally well 
with the frequent assertion of sociologists that the more in- 
telligent individuals in the farming communities are constantly 
moving to the cities. It should be said in this connection 
that the country district studied is of a distinctly poor char- 
acter, the land being hilly and unproductive, and many of the 
people being “ poor whites ” from the mountains of Kentucky. 

The second group of data obtained with the Primer scale 
consists of results from 337 children six, seven and eight 
years old—all the children of these ages—in the schools of 
an Indiana city of about 12,000 inhabitants. The Primer scale 
was given these children in the first three grades as part of a 
total survey of the school system—the “Cross-Out” scale 
being given to children above the third grade. These last 
results (using again the data from children 10-14 years old—a 
total of 548 cases), were then grouped according to the occu- 
pation of the fathers. Children whose fathers were pro- 
fessional men (doctors, teachers, ministers, lawyers) were 
placed in one group; children whose fathers were executives 














94 PRESSEY 


(independent business men, foremen) constituted a second 
group. The third group was made up of the children of 
“artisans ” (skilled workmen, machinists, railroad engineers) ; 
and the fourth group consisted of the children of unskilled 
laborers. The per cent, in each group, scoring above the 
median for their age, was then determined. These per cents 
ran as follows: 


Occupation Group:         Professional   Executive   Artisan   Laborer
No. of cases                   57           105        138       248
% above median for age         85            68         41        39
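
The measure used in these tables, the per cent of a group scoring above the median for their age, can be sketched as follows. This is a hedged illustration only: the scores are invented, the function names age_medians and percent_above_median are hypothetical, and the paper does not give its raw data or its exact computing procedure.

```python
from statistics import median


def age_medians(norm_scores):
    """Median score at each age in the reference (norm) group.

    norm_scores is a list of (age, score) pairs.
    """
    by_age = {}
    for age, score in norm_scores:
        by_age.setdefault(age, []).append(score)
    return {age: median(scores) for age, scores in by_age.items()}


def percent_above_median(group_scores, medians):
    """Per cent of a group whose score exceeds the norm median for their own age."""
    above = sum(1 for age, score in group_scores if score > medians[age])
    return 100.0 * above / len(group_scores)


# Invented scores, for illustration only.
norm = [(10, 40), (10, 50), (10, 60), (11, 45), (11, 55), (11, 65)]
group = [(10, 55), (10, 45), (11, 70), (11, 50), (11, 56)]

medians = age_medians(norm)
result = percent_above_median(group, medians)
print(result)
```

In the country-city comparison the norm group was the city children; in the occupational comparison the medians were those of the entire group, as the paper's own footnote on group sizes indicates.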


It was felt, however, that these findings might not be truly 
a measure of differences in native endowment, because of the 
possible influence of the environmental factors mentioned 
above. As a check on these factors, results obtained with the 
Primer scale from the children 6-8 years old were, therefore, 
grouped in the same way, and the groups compared by the
same method. The per cents were as follows:

Occupation Group:         Professional   Executive   Artisan   Laborer
No. of cases                   21            51        127       138
% above median for age         79            60         54        38


The results are again largely similar to the results obtained 
from the older children.⁴

It might be argued, however, that the two scales were test- 
ing, in different form, somewhat the same special abilities: 
particularly is this argument plausible since both scales used 
in this last study (of the correlation of occupation with in- 
telligence) are “cross-out” scales, and are largely similar 
in general scheme, presentation and problem. To obtain some 
light upon this question the results with the Primer scale were, 
therefore, analyzed by test. 

The first test of the scale consists of groups of dots, each 
group of dots making a pattern; there is, however, one dot 
in each group which is outside of or spoils the pattern—this 
dot the children are to cross out. The second test is made 
up of squares, each square containing two objects which are 
similar, in some important way, and one object which is dif- 
ferent from the other two: this different thing the children 


4 It should be pointed out that the occupational groups are of differ- 
ent sizes, and that the smallness of the professional groups, in both 
instances, besides lowering the reliability, decreases the dispersion of 
the distribution and so exaggerates the superiority of this group as 
measured from the median of the entire group. For comparative 
purposes, however, as used here, the method serves well enough.

are to cross out. The third test shows at the top of the page 
four forms—a triangle, a square, a cross, and a circle. Below 
are squares or “ boxes” containing “ blocks ” which are to be 
fitted into the four places at the top of the page: but in each 
“box” there is one piece which will not fit in—and this 
extra piece is to be crossed out. The last test shows pictures 
in each of which there is something wrong—this wrong part 
is to be crossed out. 

The second and last tests thus involve considerable infor- 
mation: these two tests might, then, be influenced by home 
environment. But it is hard to see how such environmental 
factors could operate to as great an extent in developing a 
child’s ability to recognize dot patterns, or assist him in dis- 
criminating the geometrical forms of the third test. It should, 
therefore, be possible, by comparing results on the four tests, 
to form something of an estimate of the “ reliability” of the 
measures obtained: in so far as the findings are constant from 
test to test of the scale used, it might be possible even to infer 
that similar findings would be obtained if other tests still were 
employed. The following table shows the per cent of children 
in each occupational group scoring above the median for their 
age, on each of the four tests: 


Occupational Group:    Test 1    Test 2    Test 3    Test 4
Professional             68        70        72        71
Executive                62        62        58        61
Artisan                  50        61        51        54
Laborer                  40        38        42        47


As will be seen, the per cents are distinctly constant from 
test to test. The differences are quite as great on the two 
tests which we would expect least influenced by home environ- 
ment as they are on the two tests we would expect most 
sensitive to such influences. 
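
The claim in the preceding paragraph can be checked from the table by taking, for each test, the difference between the highest and lowest occupational groups. The sketch below copies the per cents from the table; the names PERCENT_ABOVE_MEDIAN, gap and mean_gap are illustrative and not part of the original analysis.

```python
# Per cent scoring above the age median on Tests 1-4, copied from the table above.
PERCENT_ABOVE_MEDIAN = {
    "Professional": [68, 70, 72, 71],
    "Executive":    [62, 62, 58, 61],
    "Artisan":      [50, 61, 51, 54],
    "Laborer":      [40, 38, 42, 47],
}


def gap(test_number):
    """Professional minus Laborer per cent on one test (tests are numbered 1-4)."""
    index = test_number - 1
    return PERCENT_ABOVE_MEDIAN["Professional"][index] - PERCENT_ABOVE_MEDIAN["Laborer"][index]


def mean_gap(test_numbers):
    """Average Professional-Laborer gap over a set of tests."""
    return sum(gap(t) for t in test_numbers) / len(test_numbers)


gaps = [gap(t) for t in (1, 2, 3, 4)]
print(gaps)
# Tests 1 and 3 (dot patterns, geometrical forms): expected least affected by home environment.
print(mean_gap([1, 3]))
# Tests 2 and 4 (similar objects, picture absurdities): expected most affected.
print(mean_gap([2, 4]))
```

On these figures the average gap is 29 points on Tests 1 and 3 and 28 points on Tests 2 and 4, which is consistent with the statement that the differences are quite as great on the tests least open to home influence.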

Since then, the findings obtained from the two scales are 
consistent, and the findings are consistent from test to test, 
it seems reasonable to infer that the differences found between 
the occupational groups are probably true differences in a 
fundamental, underlying general intelligence or native endow- 
ment. And if we may assume that intelligence is hereditary 
we may argue back to the conclusion (regarding the parents 
of these children) that there is a positive correlation between 
occupational level and native ability; in general, at least, people
find the level of work for which they are capable. 

III. Discussion. The writer does not wish to press these 
conclusions. It is evident that all these tests are pencil and 

















96 PRESSEY 


paper tests. The country children, young as well as older, 
are more shy with strangers than are city children. The 
children from well-to-do homes often have nursery games 
somewhat analogous to the tests. 

But in so far as the results are valid, as thus interpreted, 
they have further interesting bearings upon certain general 
problems of mental measurement which should be pointed out. 
In the first place, these successive agreements between the 
scales used contribute to the validation of both types of exam- 
ination. The scales for use with the older children stand 
out as, for the most part, untouched by environmental factors 
which might very well be expected to influence, in an illicit 
way, the findings. The substantial agreement of the Primer 
scale with the Standard tests included in the scales for the 
upper grades is evidence to show that the Primer scale is 
also measuring “ general intelligence.” More interesting still, 
however, is the way in which there emerges, from these vari- 
ous comparisons, the outline of a unitary “ general ability.” 
The data is obviously most inadequate as the basis for any 
inferences regarding such a large problem. But the writer 
cannot but feel that a large number of such simple and direct 
comparisons, between a variety of groups and using a variety 
of tests, are more needed at present in the study of general 
and special ability than more elaborate mathematical analyses 
of a relatively small amount of data. 

Summary 

The paper reports comparisons of (a) country and city 
children, and (b) children from different economic levels, by 
means of a group scale of intelligence applicable to the first 
three grades. It was found that 

(1) 22% of the country children 6-8 years old score above 
the median for their age made by city children. 

(2) Children of professional and business men rate dis- 
tinctly above children of laboring men and mechanics. 

(3) Similar results were found in surveys by means of 
scales applicable to the older children. It is, therefore, argued 
(a) that these differences previously found were differences 
in innate ability, not in schooling or home culture, and (b) 
that there was some general factor (presumably general men- 
tal endowment) independent of the particular tests used, with 
respect to which these groups differed. 




















VII. FIRST REVISION OF A GROUP SCALE DESIGNED FOR 
INVESTIGATING THE EMOTIONS, WITH 
TENTATIVE NORMS. 


By S. L. PRESSEY AND O. R. CHAMBERS


1. The Tests. In the June number of this JOURNAL¹ certain
“tests” were described intended for the investigation of
emotional interests and distractibility. The tests were shortly 
after given to three small groups, one consisting of college 
students, another composed of girls from the State Industrial 
School, and a third made up of dementia praecox cases from 
two hospitals for the insane. The data were decidedly meager, 
but served nevertheless to indicate faults so marked as to make 
it seem wise to revise the tests before experimenting with 
them further. The tests as thus reconstructed may be described
briefly as follows:

Test I. Affective Spread and Displacement: The test con- 
sists of 25 lists of words; each list contains five words, making 
a total of 125 words in all. All but 25 of the words name 
things more or less unpleasant. The subjects are told to read 
through the list and cross out every word that is unpleasant
to them. There is no time limit, every subject being given 
time to finish. After the last subject has finished the last 
line the directions are that the group is to go through the 
list again, and draw a line around the one word in each line 
which is most unpleasant. In scoring, the total number of 
words marked unpleasant is first counted and used as a meas- 
ure of affective spread or tendency to emotionalize. The 
number of lines in which the subject chooses as most unpleasant 
a word other than the word so chosen by the most of the 
average cases (that is, the modal word) is then counted and 
the sum used as a measure of emotional peculiarity or displacement.²








1 S. L. Pressey and L. W. Pressey, “Cross-out” Tests, with Suggestions
as to a Group Scale of the Emotions, Journal of Applied
Psychology, Vol. III, 1919, pp. 138-150.

2 Those who are familiar with Freudian terminology and theories 
will understand at once from the name of the test the general notion 
back of it, and the type of abnormal mental condition to which it is 
hoped the test will be sensitive. In fact, the five tests of the “scale” 
might, not altogether inaptly, be described as an attempt to investigate 
Freudianism experimentally.


The first five lines of the test run as follows: 

1. disgust    fear       sex        suspicion   aunt
2. roar       divorce    dislike    sidewalk    wiggle
3. naked      snicker    wonder     spit        fight
4. failure    home       rotting    snake       hug
5. prize      gutter     thunder    breast      insult
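
The two scores described for Test I can be sketched as below. This is an illustration under stated assumptions: the modal words and the subject's responses are invented, since the paper does not list the modal choices at this point, and the function name score_test_one is hypothetical.

```python
def score_test_one(crossed_out, most_unpleasant, modal_words):
    """Return (affective spread, displacement) for one subject.

    crossed_out     -- set of all words the subject crossed out as unpleasant
    most_unpleasant -- {line number: word circled as most unpleasant in that line}
    modal_words     -- {line number: word most often circled by average cases}
    """
    # Affective spread: total number of words marked unpleasant.
    spread = len(crossed_out)
    # Displacement: number of lines where the subject's choice differs from the modal word.
    displacement = sum(1 for line, word in most_unpleasant.items()
                       if word != modal_words[line])
    return spread, displacement


# Hypothetical modal choices for the five sample lines (not given in the source).
modal = {1: "disgust", 2: "divorce", 3: "spit", 4: "rotting", 5: "insult"}

# Hypothetical responses from one subject.
crossed = {"disgust", "fear", "suspicion", "divorce", "dislike", "spit",
           "fight", "failure", "rotting", "snake", "gutter", "insult"}
circled = {1: "fear", 2: "divorce", 3: "spit", 4: "snake", 5: "insult"}

scores = score_test_one(crossed, circled, modal)
print(scores)
```

Here the subject marked twelve words as unpleasant and departed from the assumed modal word on two of the five lines.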

Test II. Emotional Distractibility: The test consists of two 
parts: the first half is a paragraph of very commonplace 
and stupid reading matter, with 20 irrelevant words scattered
in through the test; the subject is to read through the passage 
and cross out all irrelevant words. A rigid time limit of 
one minute and twenty seconds is set. At the end of that time 
the subjects are told to stop and go through the second para- 
graph in the same way. This second paragraph is sown in 
the same way with 20 irrelevant words. But it is a very grue- 
some description of a medieval execution. The score con- 
sists of the number of extra words missed in the first passage 
less the number of extra words missed in the second, the 
idea being that the emotional excitement of the second passage 
should cause the subject to overlook more irrelevant words 
here.* 
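
The scoring rule just described can be sketched in a few lines of Python. This is only an illustration: the function name and the sample counts are hypothetical, and the sign convention follows the wording above (misses in the first passage less misses in the second), so that a subject who overlooks more words in the gruesome passage receives a negative score.

```python
WORDS_PER_PASSAGE = 20  # irrelevant words sown in each passage, per the text


def distractibility_score(missed_first, missed_second):
    """Misses in the commonplace passage less misses in the gruesome one."""
    for missed in (missed_first, missed_second):
        if not 0 <= missed <= WORDS_PER_PASSAGE:
            raise ValueError("misses must lie between 0 and 20")
    return missed_first - missed_second


# Hypothetical subject: 2 words overlooked in the first passage, 6 in the second.
example_score = distractibility_score(2, 6)
```

A score near zero would suggest that the emotional passage made no difference to the subject's accuracy.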

The first five lines of each passage are given below: 

This evening’s the “ Herald” says that the Milton property 
east of 3rd Street was sold this man morning to Smith and 
Cooper out of Chicago. It seems that is Smith has been, 
for some time, looking for a good poor piece of land in the 
business heart part of town upon rock which he might build 
another boat of his chain of 10c stores. 

In the past time the most horrible and terrible forms of 
punishing crime were far common. Taunton tells many of an 
execution for treason under the most cruel and revolting in 
conditions. The man was hanged for three minutes, then, 
when his struggling wits began to decrease, was cut down, 
stripped, and his abdomen wide. 

Test III. Moral Discrimination and Experience: The test 
is superficially somewhat similar to test I. It consists of 25 
lists, each of five moral terms. The subjects are told to go 





3 The test was developed on the basis of some unpublished work done
by one of the writers (Dr. Pressey) some years ago at the Psycho- 
pathic hospital with dementia praecox and psychopathic personality 
cases, but dates back ultimately to a card sorting test with pictures for 
distraction described by Boring (Boring, E. G. “Learning in Dementia 
Praecox” Psychological Monographs, Vol. 15, 1913, No. 63, pp. 101). 







through the lists and cross out in each list the thing that they 
consider worst. There is no time limit. The subjects are 
next told to go through the list again and draw a line around 
the wrong act or idea which they consider most common. 
The scores consist of the number of deviations from the most 
commonly chosen worst act and the most frequent sin. The 
idea has been that it might be illuminating to set over against 
each other moral and intellectual or experiential judgments 
in this way. As a matter of fact the judgments as to most 
common sin have proven most interesting.⁴

The first five lines of the test run as follows: 

1. insulting, quarreling, mislead, hurting, carefulness. 

2. borrowing, stealing, gambling, honesty, begging. 

3. hate, rudeness, liking, dislike, fighting. 

4. drunkenness, temperance, cursing, flirting, beating. 

5. religious, crossness, smoking, stealing, swearing. 

Test IV. Free Association: The test consists of a list of 
25 words in capitals, each word in capitals being followed by 
a list of five other words in small letters. The subjects are 
told simply to go through the lists and draw a line through 
the one word in each list which is most closely connected in 
their minds with the word in big letters at the beginning of 
the list—they are to cross out the word which they most 
naturally think of in connection with the first word. There 
is no time limit. The score consists simply of the number of 
variations from the most common associate.⁵





4 The test is an attempt to put in reasonably satisfactory and objective 
form an ethical discrimination test. It is therefore related to tests 
of this general nature described by Healy, Tests for Mental
Classification, Psychological Monographs, No. 2, Vol. 12, 1911, and
Guy Fernald, The Defective Delinquent Class: Differentiating Tests,
American Journal of Insanity, Vol. 68, No. 4, April, 1912.

5 The test derives directly from the Kent-Rosanoff article (Kent, 
Grace Helen and Rosanoff, A. J., A study of Association in Insanity, 
American Journal of Insanity, Vol. 67, Nos. 1 and 2, 1910). The list 
of words in capitals is from the Kent-Rosanoff list of 100 words and 
most of the other words used are from their list of associates. As will 
be seen, the list begins with one association of very high frequency, 
according to their tables, in each line and goes down to absolute 
irrelevancy as near as the writers could make it. 

It should also be mentioned that the writers have asked, after the 
associations have been marked, that the subjects go through the lists 
again and draw a line around each word that means “something to 
eat or drink, or something to wear, or a part of the body;” there are 
25 such words in the lists. This part of the test was planned as merely 
a rough intelligence test for check on the general mental level of the 
groups which might be investigated. It is perhaps better omitted; the 
omission of this part may affect results on the last test, however. 
















The first five lines of the test run as follows: 

1. BLOSSOM nice flower pour poison cheese 

2. LAMP fear cheer match light dogs 

3. BATH nakedness hen water danger paper 

4. KING dog tyrant fish queen grade 

5. SLEEP midnight rest beautiful worry baseball 

Test V. Emotional Memory: The test consists of a list 
of one hundred words, fifty of which have occurred in the 
previous tests, and fifty of which have not. Of these fifty, 
twenty-five have been chosen as emotional and twenty-five 
as unemotional. The subjects are told to go through the list 
and cross out all the words which they think have oc- 
curred in the previous tests. Two scores are obtained: (a) 
excess of emotional over unemotional words correctly remem- 
bered, and (b) excess of emotional over unemotional words 
which were marked as remembered but which did not occur 
in the previous tests.⁶
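
A minimal sketch of the two Test V scores, assuming the words are held as Python sets. The short word lists below are hypothetical stand-ins and not the test's actual groups, and the split of the previously seen words into emotional and unemotional is likewise an assumption made so that score (a) can be computed.

```python
def emotional_memory_scores(marked, old_emotional, old_unemotional,
                            new_emotional, new_unemotional):
    """Return (a, b): excess of emotional over unemotional words that were
    (a) correctly remembered and (b) marked although they never occurred."""
    a = len(marked & old_emotional) - len(marked & old_unemotional)
    b = len(marked & new_emotional) - len(marked & new_unemotional)
    return a, b


# Hypothetical record: the words one subject crossed out as "seen before".
marked = {"fear", "hate", "paper", "brutal", "ghastly", "piano"}
scores = emotional_memory_scores(
    marked,
    old_emotional={"fear", "hate", "nude"},         # assumed to have occurred earlier
    old_unemotional={"paper", "water", "rock"},     # assumed to have occurred earlier
    new_emotional={"brutal", "ghastly", "hacked"},  # assumed new
    new_unemotional={"piano", "pencil", "east"},    # assumed new
)
```

Either score may be negative, which corresponds to "the reverse" of an excess.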

The first five lines of the test run as follows: 
fear finger paper story nude rose east brutal slashed 
liquid business hacked crave Smith hate screamed piano 
ground dollars author belly ripped rock yards pears flirting 
railroad vomit cow horrible seduce trust mind gloomy water 
lover funeral tall rape shrieked parts pencil ghastly 

2. Nature of the Revision. The general nature of the 
revision may be very briefly indicated. The four tests first 
experimented with have already been described in the pre- 
vious article referred to above (note 1). These tests were 
given to some thirty college students, and about the same 
number of girls at the state reform school and dementia praecox
cases at two state hospitals.⁷ Study of results from
these three groups led to the following general conclusions. 
In the first place, tests for use with such widely different 
groups (and particularly group tests for use with the insane) 





6 The notion of the test derives from the experience of one of the
writers (Dr. Pressey) as subject for an experiment by Tolman; among
other subjects Tolman investigated the influence of affective toning
upon memory (Tolman, E. C., and Isabelle Johnson, Am. J. Psych.,
Vol. 29, 1918, pp. 187-195), but the subject is, of course, an old one
with a very considerable literature.

7 Acknowledgments are due to Dr. Kenosha Sessions, superintendent
of the Indiana Girls’ School, to Dr. Max Bahr, of the Central Indiana 
Hospital for the Insane, and to Miss Hazel Hansford, psychologist and 
field worker at the Southeastern Indiana Hospital for the Insane, for 
their kindness and help in securing the data. 





































cannot be time-limit tests. In the second place, every effort 
must be made to use words which will be known to everyone; 
limitations of vocabulary must be carefully considered, par- 
ticularly in working with delinquent groups. In the third 
place, some method is desirable by which response in chance 
fashion to a test may be found out; otherwise, in scoring papers
from psychotics, it is often impossible to tell erratic chance 
reactions from erratic thinking which is nevertheless on the 
problem. 

As a result of these considerations only one time limit 
test was kept in the revised form presented above. Com- 
parison of girls’ school and college results served to indicate 
at least the most gross differences in vocabulary. And a 
check on chance reaction was sought by the use of “jokers.”
That is, in each line of the “what is worst” test there is one
virtue, and in each line of the “unpleasant” test there is one
word which is either positively pleasant or at least not marked
as unpleasant by anyone in either the college or the girls’
school group.⁸ In scoring the tests these jokers are first
glanced over, and any records showing an appreciable number
of responses on the jokers are thrown out. In an effort to
obtain the maximum amount of information from each test
the first, third, and fourth tests are made to yield double
scores; the device appears to work very well (in fact, the
writers seriously considered making one test yield three scores)
and would seem of some general usefulness.

The changes made in the individual tests cannot be gone 
into in detail. It may be said shortly that the data obtained 
from the three groups mentioned above were analyzed very 
closely, elements which appeared differential seized upon and 
more like them added, and the sensitivity of the tests increased 
by proper grouping of items. The materials were also arranged 





8 In these two tests the jokers are put into the test according to a
set scheme, in order to facilitate scoring; the joker is fifth in the first
line, fourth in the second line, third in the third line, and so on, the
series beginning over again in the sixth line. This scheme is not
readily hit upon by a subject. And it is also (a more important point)
not the sort of thing a subject might fall in with unwittingly as the
result of an automatism; a subject might react uniformly to the first
word in every line, or the middle word, or he might take progressively
the next word, but he would hardly be likely to work backward in this
way. These considerations are particularly important in the first
test, where not only the jokers, but all the classifications, run in this
way.
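
The joker arrangement described in this note amounts to a simple rule on the line number. The sketch below is illustrative only: the function names are hypothetical, the five lines are those printed above for the “what is worst” test, and what counts as an “appreciable number” of joker responses is left to the scorer because the text gives no figure.

```python
def joker_position(line_number, words_per_line=5):
    """1-based position of the joker in the given 1-based line:
    fifth, fourth, third, second, first, then fifth again."""
    return words_per_line - (line_number - 1) % words_per_line


WORST_TEST = [
    ["insulting", "quarreling", "mislead", "hurting", "carefulness"],
    ["borrowing", "stealing", "gambling", "honesty", "begging"],
    ["hate", "rudeness", "liking", "dislike", "fighting"],
    ["drunkenness", "temperance", "cursing", "flirting", "beating"],
    ["religious", "crossness", "smoking", "stealing", "swearing"],
]

# Pick out the joker of each printed line by the rule alone.
jokers = [line[joker_position(n) - 1]
          for n, line in enumerate(WORST_TEST, start=1)]


def joker_responses(record):
    """Count the lines in which the crossed-out word is the joker.
    `record` maps 1-based line numbers to the word the subject chose."""
    return sum(1 for n, word in record.items() if word == jokers[n - 1])


# Hypothetical record with two responses on jokers (lines 2 and 4).
record = {1: "hurting", 2: "honesty", 3: "hate", 4: "temperance", 5: "stealing"}
suspect_count = joker_responses(record)
```

The rule recovers the one virtue in each of the five printed lines, which is a useful check that the position formula matches the description.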













more systematically, and so as to permit of more ready 
analysis. One test was dropped altogether, one test of the
“Cross-out” scale⁹ radically made over and included in this
series, and the free association test added.

The general nature of the changes made may be illus- 
trated by the revision of the first test. The first results made 
it evident that there are certain words, such as murder, which
practically everybody, whether psychotic, delinquent, or college
student, considers unpleasant; other words, such as smile, everyone
considers pleasant. Such words are evidently of no value
(except as “jokers”); they are not differential. Therefore
in the final form no word of the first test was kept unless
it was considered unpleasant by more than 20% and less than 80%
of both the college group and the girls’ school group; words
of about the same percentage of unpleasantness
were also put together in the same line in order to make
the test sensitive to small differences of opinion or affective
attitude. But it was felt that this was not enough. It was to
be expected that the delinquents and psychotics would differ 
from normal folk not merely by more random choices; the 
atypical cases would show instead peculiar but consistent 
trends. A guess was therefore made as to what trends would 
be interesting to investigate, and words of five types chosen, 
words which would be unpleasant because of their relation 
to disgust, fear, sex, or self-feeling, and the jokers. One of 
each type of word was put in each line, after the scheme used 
with the jokers; the key to the arrangement is given by the 
first line of the test, as shown above. 
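
A rough sketch of how this arrangement lets a scorer tally a subject's marks by type. It assumes that “after the scheme used with the jokers” means every type shifts one place to the left with each line, and that “suspicion” in the key line stands for the self-feeling type; the type labels for lines 2 to 5 are therefore inferred from the key and are not stated by the authors. The function names and the sample record are hypothetical.

```python
CATEGORIES = ["disgust", "fear", "sex", "self-feeling", "joker"]

# The five lines of the first test as printed above.
TEST_I = [
    ["disgust", "fear", "sex", "suspicion", "aunt"],
    ["roar", "divorce", "dislike", "sidewalk", "wiggle"],
    ["naked", "snicker", "wonder", "spit", "fight"],
    ["failure", "home", "rotting", "snake", "hug"],
    ["prize", "gutter", "thunder", "breast", "insult"],
]


def category_at(line_number, position):
    """Type of the word at a 1-based position in a 1-based line, assuming
    the key line's order rotates one place leftward on each new line."""
    return CATEGORIES[(position - 1 + line_number - 1) % len(CATEGORIES)]


def tally_by_category(marked):
    """Count the words a subject marked unpleasant, grouped by inferred type."""
    counts = {name: 0 for name in CATEGORIES}
    for n, line in enumerate(TEST_I, start=1):
        for p, word in enumerate(line, start=1):
            if word in marked:
                counts[category_at(n, p)] += 1
    return counts


# Hypothetical subject whose marks cluster on one type.
tally = tally_by_category({"naked", "hug", "breast", "snake", "aunt"})
```

Under this reading the joker falls fifth, fourth, third, second and first in the five lines, which agrees with the scheme given for the jokers.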

3. The Tentative “Norms.” The tests, as thus revised, 
were then given to a total of 101 college students, 49 men and 
52 women. From these results the following tentative norms 
were worked out: 

Test I. (a) per cent of each sex, and of the entire group, 
considering each word unpleasant; (b) number of deviations, 
in choice of the most unpleasant word, from the modal choice. 

Test II. difference between the numbers of extra words correctly
crossed out in the unemotional and emotional passages
(unemotional less emotional).

Test III. (a) per cent, for each sex and for the entire group,
considering each wrong act worst, and most common; (b)
number of deviations, in choice of the worst and the most
common, from the modal choice.





9 See the previous article mentioned in note 1 above.



































































Test IV. (a) per cent, for each sex and for the entire 
group, choosing each word as an associate; (b) number of 
deviations, in choice of the associate, from the modal choice.¹⁰

Test V. (a) excess (or the reverse) of emotional over un- 
emotional words correctly remembered; (b) excess (or the 
reverse) of emotional over unemotional words marked, but 
not actually occurring in the previous tests. 
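
The “number of deviations from the modal choice” used in the norms for Tests I, III and IV can be sketched as follows. The function names and the four short records are hypothetical; each record lists one subject's choice for each line. How the authors treated ties for the modal word is not stated, so the tie behaviour of `Counter.most_common` (first word encountered wins) is an assumption of this sketch.

```python
from collections import Counter


def modal_choices(group):
    """Most frequently chosen word on each line across the norm group."""
    n_lines = len(group[0])
    return [Counter(subject[i] for subject in group).most_common(1)[0][0]
            for i in range(n_lines)]


def deviations(subject, modal):
    """Number of lines on which the subject departs from the modal choice."""
    return sum(1 for choice, mode in zip(subject, modal) if choice != mode)


# Hypothetical norm group: four subjects, three lines each.
group = [
    ["fear", "divorce", "spit"],
    ["fear", "divorce", "naked"],
    ["disgust", "divorce", "spit"],
    ["fear", "roar", "fight"],
]
modal = modal_choices(group)
deviation_scores = [deviations(subject, modal) for subject in group]
```

A high deviation score marks a subject whose choices are unusual relative to the group from which the modal words were taken.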

4. Purpose of the Tests. The writers realize, very de- 
cidedly, the crudeness of the tests, and the inadequacy of the 
data accumulated so far. But the data already obtained will 
serve at least for a rough first orientation in dealing with 
any further material which may later be accumulated. Re- 
sults from a group of factory hands or other relatively un- 
skilled laborers, from a group of colored adults, a group of 
delinquents, and a pathological group (preferably neurotics 
or early dementia praecox) are particularly desired. The 
writers are not so situated that such data are readily obtainable;
it is with the hope that others, who are already working
with such special groups, may be interested to thus experiment
that the present paper is being published. The test forms,
and tabulations to date, will be gladly furnished to any who 
may be interested to do such work. 

And as an indication of what can be done in the develop- 
ment of such data the writers wish to present very briefly, 
in closing, the results of an analysis, for sex differences, of 
the responses on the first test. The subjects were told to 
cross out the words which were unpleasant to them. It was 
found that 55% of the women marked more words as un- 
pleasant than did the median man—that is, the difference 
was negligible. However, the per cent, for each sex, marking 
each word as unpleasant was next found, the twelve most 
differential words located, and the number of these words 
marked as unpleasant by each man and each woman counted. 
It was found that 94% of the women marked more of these 
words unpleasant than did the median man! 
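
The two steps of this analysis can be illustrated with a small sketch. All data below are invented for illustration, the function names are hypothetical, and “most differential” is taken here to mean the largest excess of the women's percentage over the men's, which is one reading of the text rather than a stated definition.

```python
from statistics import median


def pct_exceeding_median(scores, reference):
    """Per cent of `scores` lying above the median of `reference`."""
    m = median(reference)
    return 100 * sum(s > m for s in scores) / len(scores)


def pct_marking(records, word):
    """Per cent of subjects whose set of marked words includes `word`."""
    return 100 * sum(word in r for r in records) / len(records)


def most_differential(words, women, men, k):
    """The k words with the largest (women minus men) percentage marked."""
    return sorted(words,
                  key=lambda w: pct_marking(women, w) - pct_marking(men, w),
                  reverse=True)[:k]


# Invented records: each set holds the words one subject marked unpleasant.
words = ["snake", "spit", "gutter", "fight"]
women = [{"snake", "spit"}, {"snake", "gutter"},
         {"snake", "spit", "fight"}, {"spit"}]
men = [{"fight"}, {"snake", "fight"}, {"gutter", "fight"}, {"fight", "spit"}]

# Step 1: compare total numbers of words marked.
overall = pct_exceeding_median([len(r) for r in women], [len(r) for r in men])

# Step 2: restrict the count to the most differential words.
key_words = most_differential(words, women, men, k=2)
on_key = pct_exceeding_median([len(r & set(key_words)) for r in women],
                              [len(r & set(key_words)) for r in men])
```

In this toy case the total counts barely separate the groups, while the count restricted to the selected words separates them completely, which mirrors the contrast between the 55% and 94% figures reported above.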

We do not, of course, need mental tests for the distinguishing
of the sexes; and it may be said that the writers have
tried, so far as possible, to avoid items on which sex differ- 
ences might appear so that separate sex norms would not 








10 The number of correct responses, in marking the “things to eat
or drink, things to wear, and parts of the body,” was also tabulated;
but the measure would seem of little value.



















be necessary.¹¹ But this bit of analysis will show something
of the possibilities of development contained in the tests; 
the writers see no reason for doubting that differential groups 
within each test, which shall prove little less effective, may 
be found in working on neurotics and psychopathic delinquents. 
At least there is sufficient evidence to make the experiment 
seem worth while. 


A CORRECTION 


In a minor study entitled “The Efficiency of the Group
Point Scale in Prognosticating Success and Failure in Junior
High School,” in the December number of this JOURNAL, a
prognosis chart or percentage correlation table for prognostic 
purposes was presented. By some mischance this chart was 
confused with a similar chart for correlation of score with 
Teachers’ Estimates. The correlation coefficient given in the 
article was correct. But the chart should read as follows: 


                        Division into fifths (test score)
                          1     2     3     4     5

Division into     V       0     0    30    20    50
fifths            IV     15    20    20    25    20
according         III    15    25    15    30    15
to school         II     35    25    20    10    10
marks             I      35    30    15    15     5


In each array one is lowest and five highest. 
S. L. Pressey. 





11 For the most part this has been accomplished. Thus on the fourth 
test the two sexes differ in only two instances, in their selection of the 
worst sin. But nevertheless only 22% of the women differ as much 
as the median man, from the modal selection—a result due perhaps to 
wider experience on the part of the men. It is also interesting that 
half again more sex words were selected by the women than by the men 
as most unpleasant, and that the men exceeded the women in selection 
of the “fear” words.

It should be added in this connection that the tests are by no means 
wholly a masculine production; each writer frequently consulted his 
wife, and the majority of the items of the first form were selected by 
Mrs. Pressey,—whose suggestions and help throughout the entire course 
of the work have been of the greatest value. 

















BOOK REVIEWS 


C. E. SEASHORE. The Psychology of Musical Talent. Silver, Burdett
and Company, Boston, 1919, p. xvi+288. 


In this volume Professor Seashore has assembled his tests on 
musical talent, which he presents with typical results in simple and 
untechnical language. Though the volume is addressed to students 
of applied psychology, the author has evidently had also in mind the 
music supervisor with little psychological training, for one misses a 
searching analysis of the data collected, and the treatment can hardly 
be called systematic though a very definite scheme of classification 
underlies the work. 

Musical talent is considered under five heads: musical sensitivity, 
musical action, musical memory and imagination, musical intellect and 
musical feeling. As a basis of analysis the author accepts four auditory 
attributes—pitch, intensity, duration and extensity. Tests for the first 
three of these “senses” are described and their elemental importance 
magnified. “Pitch is to the musician what color is to the artist—his 
medium of expression.” Accordingly a test of pitch-discrimination is 
taken to be a basic test of musical ability. The experiments seem to 
indicate that the physiological limit can be attained after a brief prac- 
tice period and that it does not change with age or further training, 
nor vary with respect to sex. If pitch discrimination alone is poor we 
can predict a corresponding inferiority in all its derived factors— 
though good pitch discrimination does not insure excellence in these 
other factors. As a result of testing large numbers for pitch-dis- 
crimination, Seashore has reached the following practical conclusions. 
A person who discriminates 3 vibrations or better at the level of 435 
d.v. may become a musician. One whose discrimination falls between 
3 and 8 d.v. should have a plain musical education; between 9 and 
17 d.v. one should have it only if a special inclination for some kind 
of music is shown; one whose discrimination requires 18 vibrations 
or more should have nothing to do with music. 

A normal distribution curve indicates the percentage of persons 
giving varying percentages of right judgments, the largest number, 
something over 30 per cent, giving 85 per cent of right judgments in 
these tests. Tests are also recorded for the range of pitch, the results 
showing increased sensitivity from the level of 64 d.v. to 128 d.v. with
an at first gradual and then more rapid decrease from 256 d.v. upwards. 

With regard to the “sense” of intensity, interferences in hearing 
are briefly discussed together with the phenomena of tonal gaps and 
tonal islands. The use of the audiometer is described and also the 
pitch-range audiometer for intensities at different pitches. The stand- 
ard of acuity for a very good ear allows the faintest sound of the 
instrument to be heard at each pitch-level from 200 to 3200 d.v., but 
there is no indication that these intensities are either subjectively or 
objectively equal. If they were, the results would be at variance with 



the normal curve, established by Max Wien.¹ Little difference is found
between the acuity of adults and children. No improvement was apparent
after training and blind persons were found to possess no better
hearing than persons with normal vision.

In studying the sense of time the author assumes a motor theory 
of response. Discrimination as fine as 1/100 of a second is extraor- 
dinary, while a record as poor as 1/2 of a second is equally rare. 
Though children do less well than adults in these tests, improvement 
with practice is attributed to a growing knowledge of the time-process. 
The very slight correlation of precision in hearing the durations of 
tones or of short intervals with lessons in music leads to the con- 
clusion that either musical training does not improve a capacity so 
elemental as the time sense, or that those with a good time sense are 
no more apt than others to be selected for a musical education. Pitch 
and time are found to be fairly independent variables; the need of a 
good time sense in music being regarded as contingent. 

The sense of rhythm is reduced to five fundamental capacities: the
senses of time and of intensity, auditory and motor imagery and a
motor impulse for rhythm; its measurement is correspondingly reduced
to these elements. Timbre is described as a complex of pitches. The
criteria for judgments of consonance are listed as blending, smoothness
and purity. The instruction for the consonance test is to “give the
decision on blending alone if the degree of blending (in the two-clang
comparison) is perceptibly different; if not make the decision
on smoothness—and, if there is no difference in either smoothness or 
blending base the decision on purity.” A normal order resulting from 
experiments with the piano is recorded and two comparison tests, a 
longer and a shorter, are described. The results of these tests show 
them to be independent of the age of observer and, while improve- 
ment with training is evident, it is not so noticeable as one might 
suppose. The test may, therefore, be used to advantage before the 
child knows anything of music. 

In the discussion of auditory space, the ability of the ears singly and 
together in determining the direction of a source of sound is described 
and explained in terms of difference in intensity. The sense of 
extensity “varies exactly parallel with pitch; there is an inseparable 
duality.” “For the purpose of rating talent it is therefore unneces- 
sary to concern ourselves with the isolation of the sense”; a conclusion 
which must seem of doubtful validity to those who have followed the 
recent studies of hearing; particularly Rich’s investigations of the 
volume threshold which demonstrate that its limen is quite different 
from that of pitch. 

Seashore differentiates volume and extensity, preferring to use the 
former term as a combined effect involving several factors such as 
extensity, intensity, timbre and reduplication of sound. 

Under the subject of motor control, the basic motor capacities are
outlined with respect to time as motility, timed action, response to a
simple signal, action upon choice and serial action; with respect to
movement as precision, discrimination, strength and endurance. A
variety of tests are described employing the chronograph and chrono- 
scope, the simple tapping test being adopted as an index to motility. 

Under the head of musical action Seashore describes the use of his 
tonoscope, and some of the chief results secured with it in measuring 
voluntary control of pitch. 

In considering musical imagery and imagination a somewhat exag- 


1 Pflüger’s Arch. f. d. ges. Physiol., 1903, 97, pp. 1 ff.








































gerated importance is attached to the auditory image. The results of 
a questionnaire indicated that “as a rule musicians who rate themselves 
low offer some excuse, explaining that they suffer from defective 
capacity in this particular; that they have neglected to develop it; that 
they have been engaged in some sort of musical business which does 
not make it necessary, etc.” Whatever importance may attach to
imagery it can hardly be doubted that the specific imagery of a special 
sense is less significant in indicating a particular talent of appreciation 
or execution than was once supposed to be the case; and musicians 
who deplore their lack of auditory imagery may be unduly impressed 
when their attention has been called to some apparent deficiency in 
their ability to hear inwardly. 

As for musical imagination Seashore’s treatment is suggestive rather 
than precisely analytical. “Imagination proper” he writes, “is not a 
specific mental process, as is sensation or perception, but is rather a 
designation for certain group functions of images, associations, 
thoughts, feelings, and efforts in countless permutations.” A statement 
equally vague deals with the language of music: “It is to the credit 
of language if it convey one specific idea and that only; it is to the 
credit of music if it lead to a richer self-expression transcending the 
bounds of defined concepts and literal form.” In the opinion of the 
reviewer, music transcends “the bounds of defined concepts” no more 
than does verbal language, but the definition of the musical concept as 
distinct from certain of the elements of auditory experience is lacking 
in this work. 

Of the remaining chapters the one on musical memory describes 
tests for memory-span and retention, and includes a discussion of 
learning curves and absolute pitch. With respect to musical intellect 
a sound statement is made that “ musical thought is a specialization in 
dealing with the problems which arise in music. Although the form 
and content of the thought are different, it requires the same kind of 
logical grasp as in mathematics or philosophy.” Yet no effort is made 
to clarify the peculiar form and content of music, as might
have been done by reference to the work of Stumpf, Lipps, M. F. 
Meyer and W. V. Bingham. Indeed Seashore’s conclusions rest almost 
exclusively upon results obtained in his own laboratory. The chapter 
on musical feeling is equally vague. 

The final chapter deals with pedagogical hints and the advocacy of 
a consulting supervisor of music who may be competent to test the 
musical talents of children in the schools and to judge them in accord- 
ance with their capacities and interests. 

For all the technical ingenuity with which Seashore has attacked the 
important problem of musical diagnosis one feels that somehow the chief 
feature of musical talent has escaped his “dragnet;” this being the 
ability to conceive and to think in terms of the musical interval. The 
neglect of the interval casts a doubt upon his “basic” test of pitch. In
discriminating pitch within the range of 3 vibrations at a level of 435
d.v. we are not dealing with a musical interval but with a mere
difference of “height.” The difference between a semitone above
435 d.v. in just intonation (464 d.v.) and the same note in tempered
intonation (460 d.v.) is greater than the limit which Seashore’s test
makes the basis of a musical diagnosis. The reviewer believes that the 
attitudes wherein one judges height and interval are quite different and 
that they rest upon different “senses.” In view of Rich’s work on the 
volume threshold it seems highly probable that the independent attribute 
or “sense of extensity” which Seashore neglects as being “exactly
















parallel with pitch” is nevertheless the elemental foundation upon which
our judgments of interval and the logic of music are based. In the 
report on pitch discrimination at different levels of pitch, we find a 
curve which indicates that in terms of fractional parts of a tone keenness 
in the sense of pitch remains approximately constant from 256 d.v. to 
2048 d.v. This curve therefore parallels Rich’s limens in that a certain 
fraction of the vibration appears to correlate with a just noticeable 
difference of “pitch,” as does a similar fraction with a just noticeable
difference of volume. It would thus appear possible that Seashore’s 
pitch test involves both pitch-height and pitch-interval. Some observ- 
ers judging pitch-height may discriminate differences of 3 d.v. and 
better at the level of 435; while others influenced by the attitude of 
interval-difference may tend to require differences exceeding the 
limen of volume, which at this level would be about 9 d.v. 
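
The semitone comparison made earlier in this paragraph can be checked with a little arithmetic. The sketch assumes the usual 16:15 ratio for the just diatonic semitone and the twelfth root of 2 for the tempered one; the variable names are illustrative. On these assumptions the tempered note comes out a little under 461 d.v., close to the round figure quoted, and the gap between the two tunings still exceeds 3 d.v.

```python
STANDARD_PITCH = 435.0  # d.v., the level used in the pitch test

# Just diatonic semitone: frequency ratio 16:15 (assumed).
just_semitone = STANDARD_PITCH * 16 / 15

# Equal-tempered semitone: frequency ratio 2 ** (1 / 12).
tempered_semitone = STANDARD_PITCH * 2 ** (1 / 12)

# Difference between the two tunings of the same note, in d.v.
tuning_gap = just_semitone - tempered_semitone

# The limit of 3 d.v. named in the review for a prospective musician.
exceeds_limit = tuning_gap > 3.0
```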

How far this neglect of a direct test for the sense of interval may 
impair the results of Seashore’s diagnosis, it is not easy to say. A 
corrective is of course to be found in several of the accessory tests 
which measure the sense of interval indirectly, such as those for 
consonance and those employing the tonoscope, in singing intervals 
and in voice control; but the emphasis which the author places upon 
the test of pitch can not seem just to those who define pitch as height, 
and who must therefore deny that “pitch is to the musician what 
color is to the artist—his medium of expression.” 

Cornell University. R. M. Ogden.


J. W. BRIDGES. An Outline of Abnormal Psychology. R. G. Adams &
Co., Columbus, Ohio, 1919, p. 127. 


This book is intended to be useful to “those medical students and 
students of social service who desire a general survey of this field but 
who have insufficient time for a regular supervised course or for
extensive reading of the very much scattered literature.” It is also 
“to serve as a guide for students of abnormal psychology in the 
absence of a comprehensive text-book.” 

The book is found, however, to be more than a mere guidebook. For
example, in the second part, on mental diseases, a concise and carefully 
arranged table of symptoms is given under the discussion of each type 
of insanity. The name, “An Outline of Abnormal Psychology,” may
be taken literally: the book is written actually in outline form. The 
first part deals with abnormal phenomena in general. Under definitions 
and classification, Wernicke’s classification only is given. This, 
however, is not held to in the author’s discussion of the subject, which 
begins with sensation, goes on through consciousness and attention, 
memory, association, judgment, orientation, feeling and temperament,
to instinct and emotion, innate action, and acquired action, much in the 
manner of the ordinary text-book of normal psychology. The section 
ends with chapters on intelligence, personality, and sleep, dreams, and 
hypnosis. Each chapter contains a full list of all possible abnormalities 
in the different manifestations of mind, together with their technical 
nomenclature. Under intelligence, the most generally used methods 
of measuring intelligence are mentioned. The second part is on mental 
disease. Feeble-mindedness, usually given at least a chapter in a 
work on abnormal psychology, is not treated here, except as one kind 
(cretinism) finds a place under thyroigenous psychoses. All the other 
varying forms, such as mongolianism, and microcephalis which certainly 
ought to be described in a book on abnormal psychology, (at least if 
















that term is taken in its broadest sense to mean not-normal psychol- 
ogy) are not even mentioned but are summarily dismissed when the 
author distinguishes between amentia or feeble-mindedness, and 
dementia, explaining that “ The former is an innate defect, while the 
latter is the result of a mental deterioration.” No general classifica- 
tions of the psychoses are given, except the author’s own, as implied 
by his chapter headings, which run as follows: Dementia Precox, 
(Paraphrenia, Paranoia), Manic-Depressive Insanity, Dementia Paraly- 
tica, The Alcoholic Psychoses, Morphine, Cocaine, and Other Drugs, 
The Presenile and Senile Psychoses, and The Symptomatic Psychoses. 
But in the third part, on Borderline Diseases, Freud’s, Sidis’s, and 
Kraepelin’s classifications of the neuroses are presented. 

Throughout the work impartial consideration is given to many and 
widely differing theories of the various phases of abnormal psychology. 
Copious references are to be found after each chapter. One wishes 
that a comprehensive index might also have been given a place. 

This outline will probably be of greater assistance to those already 
fairly familiar with the subject, but who need help in clarifying and 
systematizing their knowledge, than to those who, by the reading of 
this book alone, would gain their whole information. The work is at 
once too technical and too abbreviated to be very meaningful to the 
average layman. Marjory Bates. 

Clark University. 


CARTER ALEXANDER. School Statistics and Publicity. Silver, Burdett 
& Co., New York, 1919, p. xix+332. 


This little book is an outgrowth of the author’s teaching at the 
Peabody College for Teachers. It is produced with the avowed aim of 
aiding active superintendents of schools to adequately place before the 
public such statistics as show school needs and school achievements. 
It is also intended for use as a textbook by those who are “engaged 
in training future superintendents.” Its clear, simple style and the 
abundance of illustrative material make it very serviceable for either 
purpose, and as a textbook its value is enhanced by the suggestions to 
instructors and by the exercises which follow each chapter. 

The book falls rather readily into three main divisions. The first 
of these, consisting of the first three chapters, is largely introductory 
in value. The author points out in the first chapter the various types 
of errors and deficiencies commonly noted when good but unthinking men 
are tempted to use figures. The second chapter very practically tells 
how to collect data, and the third is an argument for knowledge of the 
technique of statistics on the part of school administrators. 

There can be no doubt in the mind of the reader after reading these 
three chapters that the author thoroughly believes in statistical, or at 
least graphic presentation of school facts to the public. At the same 
time he takes a very moderate stand on the question of the amount of 
statistical knowledge that is practically necessary. 

The second division, consisting of chapters four to eight inclusive, 
affords a very clear elementary presentation of statistical method. The 
fifth, sixth and seventh chapters are the most difficult in the book and 
the author admits that they will require a little more careful study than 
the others, but despite this admission it would be hard to find a 
clearer or simpler exposition of the matters treated. Chapters six and 
seven, treating of measures of deviation and of relationship, are the 
most technical, and it is possible that in a majority of cases the busy 
superintendent could omit them without serious loss so far as their 
value in connection with his local public is concerned. If he is to take 
his work very seriously, however, they are very needful in order that 
he may be able to interpret his data completely. 

The remaining four chapters, forming the third division, present 
devices for making clear and striking presentations of data. These 
devices are well chosen, not only by reason of their effectiveness but 
also because of their ingenious use of materials and tools ready at hand 
even in a poorly equipped office. This practicality is further evidenced 
by the author’s statement that all but five of the seventy-nine cuts in 
the book were made by high school boys,—a demonstration of the 
possibilities of student assistance to the superintendent. 

An annotated bibliography is given, consistent in type with the rest 
of the book, its chief features being its freedom from cumbrousness and 
its choice, in the main, of readily accessible material. 

Clark University. Geo. Allen Coe. 


WILLIAM H. DOOLEY. Principles and Methods of Industrial Education 
For Use in Teacher Training Classes. (With an Introduction by 
Charles A. Prosser.) Houghton Mifflin Company, Boston, 1919, p. 
257. 


Probably no educational scheme or problem is more in the public eye 
than that of vocational or industrial education. The paucity of skilled 
help in nearly all industries is such that the problem of present supply 
is a most perplexing and serious one. Dr. Prosser sums up the situation 
in his Introduction to this book: “Never in our history has there 
been such a keen realization of the dependence of production upon 
skill, and the part that wise methods of training have in cultivating 
skill. In our ways and means for meeting these increasing demands 
we are at once fortunate and unfortunate; fortunate in adequate 
financial support for sound instructor training plans; unfortunate in a 
shortage of people to organize and direct them, and doubly unfortunate 
in a lack of organized practical material for use in instructor (teacher) 
training classes.” 

This book, as a whole, goes far to fill the place which Dr. Prosser 
notes above, and is an admirable compilation of contemporary writers 
and bulletins, and is drawn without stint from any source that seemed 
to furnish the material for the purpose in mind. 

It has been difficult for the vocational teacher-training personnel to 
get away from the academic point of view, and the sporadic efforts of 
the past and even of the present, lean too far away from practical 
sense to meet the need of the industries for practical, not theoretical, 
men. The author has seemingly been fortunate to sense this most 
important error and has arranged his book in orderly fashion, and 
clothed it in understandable language that easily makes interesting 
many of the dry details of pedagogy which are essential for the well- 
trained trade teacher. 

The book deals with the following topics: the Value of Industrial 
Education, the Educational Needs of Trades and Industries, How men 
have been trained for Trades and Industries in the Past, Different 
Types of Industrial Schools, Organization of Industrial Schools and 
of Industrial Classes, an Industrial Survey, Principles of Psychology 
underlying Learning, General Methods of Teaching, General Methods 
for Teaching in Industrial Education, Methods of Teaching Shop 
Work, of Interpretation of Blue Prints and Shop Sketching, of Shop 
Science, of Shop or Industrial Mathematics and of English and Manual 
Training compared with Industrial Education. 

A valuable feature of the book is the list of “Questions for Discussion” 
that is given at the end of each chapter. This is followed by 
a “List of References for Further Reading.” There is also an Appendix 
of forty-seven pages giving typical outlines and suggestive courses of 
study in various kinds of vocational schools. 

Taken as a whole, the work should prove valuable to the trade 
instructor and especially to those engaged in teacher-training for 
vocational schools. 


Clark University. Harry E. Milliken. 








NOTES 


The Carnegie Corporation of New York has announced its purpose 
to give $5,000,000 for the use of the National Academy of Sciences and 
the National Research Council. It is understood that a portion of the 
money will be used to erect in Washington a home of suitable archi- 
tectural dignity for the two beneficiary organizations. The remainder 
will be placed in the hands of the Academy, which enjoys a federal 
charter, to be used as a permanent endowment for the National 
Research Council. This impressive gift is a fitting supplement to Mr. 
Carnegie’s great contributions to science and industry. 

The Council is a democratic organization based upon some forty of 
the great scientific and engineering societies of the country, which 
elect delegates to its constituent Divisions. It is not supported or 
controlled by the government, differing in this respect from other 
similar organizations established since the beginning of the war in 
England, Italy, Japan, Canada, and Australia. It intends, if possible, to 
achieve in a democracy and by democratic methods the great scientific 
results which the Germans achieved by autocratic methods in an 
autocracy while avoiding the obnoxious features of the autocratic 
regime. 

The Council was organized in 1916 as a measure of national preparedness 
and its efforts during the war were mostly confined to assisting 
the government in the solution of pressing war-time problems involving 
scientific investigation. Reorganized since the war on a peace-time 
footing, it is now attempting to stimulate and promote scientific 
research in agriculture, medicine, and industry, and in every field of 
pure science. The war afforded a convincing demonstration of the 
dependence of modern nations upon scientific achievement, and nothing 
is more certain than that the United States will ultimately fall behind 
in its competition with the other great peoples of the world unless 
there be persistent and energetic effort expended to foster scientific 
discovery. 





The Commissioner of Institutions and Agencies of New Jersey, 
Burdette G. Lewis, has recently extended the scope of the State 
Psychiatric Clinic to include psychological work in the correctional 
institutions of New Jersey. The work is at present organized as a 
section of the Psychiatric Clinic of the New Jersey State Hospital for 
the Insane, Dr. Henry A. Cotton, Director. Mr. Edgar A. Doll was 
appointed as the Department Psychologist in July, 1919, following a 
period of preliminary investigation concerning the application of 
psychological methods in correctional institutions. In August, 1919, 
Mr. W. J. Ellis was appointed assistant psychologist. The psychologi- 
cal staff also includes 2 volunteer assistants, Mr. Warren S. Prince and 
Miss Anna Gillingham. The psychological work in the correctional 
institutions of New Jersey is directly related to the classification and 
marking system recently installed as a basis for parole under the 
Division of Education and Parole of the Department of Institutions 
and Agencies, Calvin Derrick, Director. The functions of the 
psychological work are to make mental diagnoses of the inmates of 
correctional institutions and on the basis of psychological tests and 
measurements to recommend concerning the educational, vocational, 
disciplinary and parole treatment of delinquents and prisoners. The 
Army group test Alpha, supplemented by individual examinations, 
has been in use since February, 1919, and surveys of the four principal 
institutions have been made. Clinical psychological methods to supple- 
ment the group tests have been installed in the New Jersey State 
Prison, at the State Home for Boys at Jamesburg, and at the State 
Home for Girls at Trenton. Psychological analyses of industrial and 
vocational activities of the institutions are in process at the State 
Prison and at the State Home for Boys. It is expected that if the 
work continues to develop the Staff of the Psychological Section will 
be expanded as needs demand. 


The present program of psychological examining at the New Jersey 
State Prison includes an attempt to introduce scientific management in 
the problems of industrial and vocational assignments of prisoners. 
The purpose of this program is (1) to improve the possibilities of 
utilizing to the best advantage the reformatory influences which the 
Prison provides, (2) to improve the morale or mental attitude of the 
prisoners by having them assigned to work which is suited to their 
capabilities and best interests, (3) to improve the efficiency of instruc- 
tion in the vocational shops, and (4) to increase the effectiveness of 
the routine work in the Prison shops. 

To accomplish these ends the psychological section of the Psychiatric 
Clinic has instituted at the request of the Commissioner and the 
Director of the Division of Education and Parole, a psychological and 
industrial analysis of the Prison activities both vocational and indus- 
trial. This analysis is intended to form the basis of intelligent 
assignment of men to shops or tasks. The regular work of the 
Psychologist calls for individual analysis of each prisoner with respect 
to his general intelligence, mental responsibility, vocational aptitudes, 
and industrial qualifications. It is the hope of the Department to effect 
the coordination of this man-analysis and job-analysis in such a way 
that, knowing the capabilities of the man and the requirements of the 
job, the welfare of the prisoners and their efficiency in their tasks may 
be materially advanced. 

Job-analysis. The Assistant Psychologist aided by a graduate 
student in psychology has conducted psychological and vocational 
analyses of numerous ‘jobs’ in the Prison. The results of this inves- 
tigation cannot yet be presented in full but the results for two jobs may be 
cited briefly. The Assistant Psychologist finds that the print shop, for 
example, is a highly specialized industry, calling for particular 
degrees of skill which are ordinarily attained only after long practice. 
The print shop also presents an industry whose various operations are 
highly differentiated with respect to the gradation of processes in the 
industry. The Psychologist, therefore, has found very clear depend- 
ence of success upon general intelligence and general education in the 
succession of tasks in the print shop. It is, therefore, comparatively 
easy to predict a man’s degree of success in learning this industry 
simply on the basis of his general intelligence and general education. 
In addition it is possible in the print shop to measure the degree of 
specific aptitude and skill which men must possess for immediate 
success in the work of the shop. It is easily possible to apply mental 
tests such as alphabet sorting and visual discrimination, and tests of 
special forms of fatigue which would be applicable to the assignment 
of men to work in the print shop. 










Contrasted with the print shop we may cite briefly the results 
obtained in the shoe shop. In this shop the succession of tasks is not 
very sharply differentiated on the basis of general intelligence. The 
general level of intelligence in the shoe shop ranges from a minimum 
of mental age 11 years to a maximum mental age of about 14 years. 
The overlapping of intelligence from operation to operation is very 
great. Moreover, the educational requirements are practically negligible. 
Therefore, in the shoe shop it is difficult to assign a man to a particular 
task on the basis of his general intelligence or education. In the 
print shop, by contrast, the range of intelligence was from a minimum 
mental age of 11 years to a maximum mental age of 18 years. The 
succession of operations within that shop was found to be very 
clearly differentiated on the basis of general intelligence. In the shoe shop 
as in the print shop it is possible to devise tests of specific aptitude for 
the several operations demanded. A cutter, for example, must be able 
to cut up a piece of leather with a minimum of waste. A psychological 
test of form perception analogous to the jig-saw puzzle might be used 
for testing this ability. As yet the psychologist has not proceeded far 
enough with his work to devise the tests needed for such work. He is, 
however, at work on this problem. 

Man-analysis. Each prisoner is examined by the psychologist almost 
immediately after his admission to the Prison. The purpose of this 
examination is to determine general intelligence, character or person- 
ality make up, actual degree of education and potential educability, 
actual degree of industrial skill and potential vocational aptitudes. 
The psychological and educational tests are relatively well developed. 
It is therefore easy enough to measure a man’s general intelligence, 
education and degree of educability with a high degree of accuracy. 
Trade and vocational tests, however, are not yet sufficiently developed 
for immediate use at the Prison. Those tests for measuring industrial 
capability and vocational aptitudes which are now available in psychol- 
ogy are not specifically suited to the needs of the work at the Prison. 
It is therefore necessary to devise special tests of this character speci- 
fically suited to the present Prison conditions. It is planned to obtain the 
industrial history of each man and then to quiz him carefully regarding 
his present degree of skill in the major trades which he claims to have 
followed. It is also planned to examine him with vocational tests to 
determine his industrial aptitudes. He will also be questioned concern- 
ing his own ambitions and the industrial possibilities in the environ- 
ment to which he expects to go when released. 

Coordination of Job-analysis and Man-analysis. Specification cards 
will be designed for all the detailed jobs in the Prison and similar cards 
will be designed to summarize all the information obtainable regarding 
each man. These cards will call for the physical, mental, educational 
and industrial requirements of the job and the similar abilities of the 
man. It will then be possible, knowing the requirements of the job 
and the capabilities of the man, to assign a man to a job on a scientific 
basis. It is, of course, unlikely that this relatively ideal program can 
be achieved in less than another year, but some 
actual progress is already being made. The assigning officer at the 
Prison is already using the results of the psychological examining in 
such a way as to enable him to place men more successfully than 
heretofore. The general conduct of the work involved in carrying out 
this program is also definitely bringing to consciousness among the 
Prison officials both the needs and the possibilities for more efficient 
classification and placement. 
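
The card-matching procedure described above may be sketched in modern terms. The following is only an illustration under stated assumptions, not the Department's actual method: the note names the four headings (physical, mental, educational, industrial) but gives no rating scale, so the numerical ratings, the rule that a man must meet every requirement, and the names JobCard, ManCard, qualifies and eligible_jobs are all hypothetical.

```python
from dataclasses import dataclass

# The four headings named in the note; the numeric scale is an assumption.
FIELDS = ("physical", "mental", "educational", "industrial")


@dataclass
class JobCard:
    """Specification card for one job (hypothetical layout)."""
    name: str
    requirements: dict  # minimum rating required under each heading


@dataclass
class ManCard:
    """Summary card for one man (hypothetical layout)."""
    name: str
    abilities: dict  # rating attained under each heading


def qualifies(man, job):
    """Assumed rule: the man must meet or exceed every requirement."""
    return all(
        man.abilities.get(field, 0) >= job.requirements.get(field, 0)
        for field in FIELDS
    )


def eligible_jobs(man, jobs):
    """Return the names of the jobs whose requirements the man meets."""
    return [job.name for job in jobs if qualifies(man, job)]


# Invented ratings for illustration only; they are not data from the Prison.
jobs = [
    JobCard("compositor",
            {"physical": 2, "mental": 4, "educational": 4, "industrial": 3}),
    JobCard("leather cutter",
            {"physical": 3, "mental": 2, "educational": 1, "industrial": 2}),
]
man = ManCard("prisoner A",
              {"physical": 3, "mental": 3, "educational": 2, "industrial": 2})

print(eligible_jobs(man, jobs))  # prints ['leather cutter']
```

A strict threshold on every heading is the simplest possible rule. A working system would presumably also weigh the man's own ambitions and the industrial possibilities of the environment to which he expects to return, as the note indicates.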











































